If you live in Hawaii, congratulations: according to a new study by researchers at the University of Vermont, you live in the happiest state in the union—at least as far as Twitter sentiment is concerned. (Hat tip to The Atlantic for posting about the research.)
The researchers—affiliated with the University’s Department of Mathematics & Statistics, Complex Systems Center, Computational Story Lab, and Advanced Computing Core—collected 10 million geo-tagged Tweets from 373 urban areas across the United States in 2011. According to the study, determining the “happiness” in those Tweets required the use of the Language Assessment by Mechanical Turk (LabMT) word list, “assembled by combining the 5,000 most frequent words occurring in each of four text sources: Google Books (English), music lyrics, the New York Times and Twitter.”
Those individual words “have been scored by users of Amazon’s Mechanical Turk service on a scale of 1 (sad) to 9 (happy), resulting in a measure of average happiness for each given word,” the study added. For example, “rainbow” would score high on the happiness scale, with an 8.1, while “earthquake” would be deep in sadness territory, with a piddling 1.9 score. The researchers made no attempt to “take the context of words or the meaning of a text into account,” something they admitted could lead to “difficulties in accurately [determining] the emotional content of small texts.”
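To make the word-scoring idea concrete, here is a minimal Python sketch of a context-free, bag-of-words happiness score. It is an illustration under stated assumptions, not the researchers' actual code: the function name `average_happiness`, the tokenizing pattern, and the choice of a simple frequency-weighted mean are all hypothetical, and the tiny dictionary holds only the two scores quoted above rather than the full LabMT list.

```python
import re
from collections import Counter

# Only these two scores come from the article; a real run would need
# the full LabMT word list, which is not reproduced here.
WORD_SCORES = {"rainbow": 8.1, "earthquake": 1.9}


def average_happiness(text, scores=WORD_SCORES):
    """Return the frequency-weighted mean score of the scored words in text.

    Words missing from the score table are ignored. Returns None when the
    text contains no scored words at all.
    """
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[w] * n for w, n in counts.items()) / total


if __name__ == "__main__":
    print(average_happiness("A rainbow appeared after the earthquake"))
    print(average_happiness("No rainbow today"))
```

The second call shows the limitation the researchers acknowledged: because context is ignored, “No rainbow today” scores exactly as high as a Tweet celebrating a rainbow.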
With their mathematical formulas and underlying data in place, the researchers went to work. According to the study, the five happiest states include Hawaii, Maine, Nevada, Utah and Vermont; the five saddest are Louisiana, Mississippi, Maryland, Delaware and Georgia. In general, the West and Northeast seemed much happier than the Mid-Atlantic and South—with the exception of Florida, which shaded “happier” than many of the surrounding states.
The happiest urban areas include Napa, CA; Longmont, CO; San Clemente, CA; Santa Fe, NM; and Santa Cruz, CA. The list of saddest cities is topped by Beaumont, TX, followed by Albany, GA; Texas City, TX; Shreveport, LA; and Monroe, LA.
Researchers also found that those cities with the highest levels of happiness—such as Napa and San Clemente—tended to produce more Tweets per capita, while the saddest Tweeted less.
The researchers admitted their study’s limitations. “There are a number of legitimate concerns to be raised about how well the Twitter data set can be said to represent the happiness of a greater population,” they mentioned near the end of the study. “Only 15 [percent] of online adults regularly use Twitter, and 18-29 year-olds and minorities tend to be more highly represented on Twitter than in the general population.” The study also collected a mere subset of all the Tweets out there, meaning that the data is ultimately “a non-uniform subsample of statements made by a non-representative portion of the population.”
Nonetheless, the researchers stood by Twitter as a tool for broader demographic research, arguing that it could come in handy for anyone attempting to study a society’s economic and social makeup. There is certainly room to refine the model: for example, by testing whether Hawaii’s status as a vacation destination affects its rate of “happy” Tweets, or whether incorporating languages other than English into the dataset would change the ultimate results.
Image: “The Geography of Happiness” study