The use of language has always been held up as one of the things that separates Homo sapiens from other animals.  Although we now know that using sound to communicate is not a uniquely human activity, the complexity and expressive range of human language is definitely unusual.  What prompted the development of language is not clear.  In part, language is a mechanism for conveying and organizing information: here is where the big animals are, and here is how we can hunt them.  But language also serves a social function, in a very social species; humans seek to understand and explain the world by telling stories, and people everywhere are inveterate gossips and chatterboxes.

Wired has an article on its “Wired Science” blog that describes some interesting new research that may shed a bit of light on this second, social function of language. Researchers from the University of Vermont and Cornell University attempted to measure the emotional content of language; from the abstract [full PDF available]:

Within the last million years, human language has emerged and evolved as a fundamental instrument of social communication and semiotic representation. People use language in part to convey emotional information, leading to the central and contingent questions: (1) What is the emotional spectrum of natural language? and (2) Are natural languages neutrally, positively, or negatively biased?

There have been past attempts to answer these questions via psychology experiments, with somewhat mixed results.  The new work is interesting because it was conducted by mathematicians, led by the University of Vermont’s Isabel Klouman, who took a different, statistical approach.  They assembled four large bodies of English text, taken from different sources:

  • 3.29 million Google Books, containing 361 billion words
  • 821 million tweets, from 2008 through 2010, containing 9 billion words
  • 1.8 million New York Times articles, from 1987 to 2007, containing 1 billion words
  • Lyrics from 295,000 popular songs, containing 58.6 million words

The team compiled a list of the 5,000 most common words in each corpus, and then combined these lists to get a final list of 10,122 common words.  Then, for each word, they collected ratings from 50 different people (using Amazon’s “Mechanical Turk” service), each of whom scored the word's emotional content on a scale ranging from 1 (extremely negative) to 9 (extremely positive).  In all four samples, words with positive emotional connotations significantly outnumbered words with negative connotations; furthermore, the positive words were also used more frequently.
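
For readers who like to see the procedure spelled out, here is a minimal Python sketch of the same kind of tally. It is not the researchers' code: the function names, the toy corpora, and the ratings are all invented for illustration, and treating the scale midpoint of 5 as the dividing line between negative and positive words is an assumption on my part, as is the frequency-weighted average.

```python
from collections import Counter


def top_words(text, n):
    """Return the n most frequent words in a text, as a set."""
    counts = Counter(text.lower().split())
    return {word for word, _ in counts.most_common(n)}


def merged_word_list(corpora, n):
    """Union of the top-n word lists from every corpus."""
    merged = set()
    for text in corpora.values():
        merged |= top_words(text, n)
    return merged


def positivity_summary(text, ratings, words, neutral=5.0):
    """Compare positive and negative words, by word count and by usage."""
    counts = Counter(text.lower().split())
    # Average the individual ratings (1 = very negative, 9 = very positive).
    mean = {w: sum(ratings[w]) / len(ratings[w]) for w in words if w in ratings}
    # Assumption: the scale midpoint separates negative from positive words.
    positive = {w for w, score in mean.items() if score > neutral}
    negative = {w for w, score in mean.items() if score < neutral}
    total = sum(counts[w] for w in mean)
    weighted = sum(counts[w] * mean[w] for w in mean) / total if total else None
    return {
        "positive_words": len(positive),
        "negative_words": len(negative),
        "positive_uses": sum(counts[w] for w in positive),
        "negative_uses": sum(counts[w] for w in negative),
        "frequency_weighted_mean": weighted,
    }


# Invented toy data; none of these numbers come from the study.
corpora = {
    "books": "love love love happy happy war",
    "lyrics": "cry cry cry happy happy love",
}
ratings = {
    "love": [9, 8, 8],
    "happy": [8, 8, 8],
    "cry": [2, 3, 2],
    "war": [1, 2, 2],
}

words = merged_word_list(corpora, n=2)
for name, text in corpora.items():
    print(name, positivity_summary(text, ratings, words))
```

The sketch keeps the two questions separate: how many distinct words in the merged list lean positive or negative, and how often those words actually turn up in a given body of text.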

The findings “suggest that a positivity bias is universal,” wrote Klouman and colleagues. “In our stories and writings we tend toward pro-social communication.”

The implications of this bias are not entirely clear, and it remains to be seen whether the results are similar in other languages.  Still, our use of language certainly contains some clues to what we are thinking; as Yogi Berra reportedly said, sometimes “you can observe a lot by just watching” — or, in this case, listening.

