Detecting sarcasm is an essential skill in understanding the people around you. So much of what we see on the internet is sarcastic and joking. Humans can instinctively pick up on sarcasm in a social media post and make a mental note that it isn’t serious. But how do serious comments get sorted from sarcastic ones by something that doesn’t understand the nuances of sarcasm, like a computer?
Think about Sheldon Cooper in The Big Bang Theory. He is a brilliant theoretical physicist but he can’t understand simple social cues. A running joke in the show is that he often doesn’t understand what’s going on around him: people are being sarcastic, he takes them too literally, and then he has to be told the comment was sarcastic so he can react appropriately. Sheldon Cooper acts much like a computer does. A computer is brilliant at understanding literal information but doesn’t understand a joke.
For class a couple of weeks ago we were assigned to watch Del Harvey’s talk about protecting Twitter users from themselves and others. She mentioned an example of a tweet with the text “yo bitch” that had a picture of a dog attached. Without the picture, this tweet reads as fairly aggressive if you don’t understand culturally relevant dialogue and references. But with the picture, Del Harvey understands it is a joke. What she doesn’t mention, however, is who or what is analyzing this single tweet. Harvey can’t personally look over every flagged tweet because, as she mentioned, there are over 500 million tweets a day.
What social media scanning is already happening?
Facebook monitors user chats and posts for predatory or illegal behavior, and it has turned in many possible suspects to law enforcement officials. In one case, a 30-year-old man was messaging a 13-year-old girl in South Florida about sex and planned to pick her up from middle school the following day. The conversation was automatically flagged, and Facebook employees looked at it and contacted authorities to have the man arrested. Facebook uses software to scan conversations for certain criteria and behaviors:
- The users aren’t friends, only recently became friends, or have no mutual friends
- They interact with each other very little
- They have a significant age difference
- They are located far from each other
Along with these criteria, the software looks for phrases used by past criminals. A Facebook employee will not actually look at a post or message unless some of these things occur.
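To make the idea concrete, here is a minimal sketch of how criteria like the ones in the list could be combined into a "send this to a human" rule. This is not Facebook's actual software. The `Conversation` fields, the function names, and every threshold (30 days, 10 messages, 10 years, 500 km, 3 signals) are assumptions invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    # All fields are hypothetical; real systems would have far richer data.
    are_friends: bool
    days_as_friends: int
    mutual_friends: int
    message_count: int
    age_gap_years: int
    distance_km: float


def count_risk_signals(c: Conversation) -> int:
    """Count how many of the four listed criteria a conversation matches."""
    signals = [
        # Not friends, only recently friends, or no mutual friends
        (not c.are_friends) or c.days_as_friends < 30 or c.mutual_friends == 0,
        # They interact with each other very little
        c.message_count < 10,
        # Significant age difference
        c.age_gap_years >= 10,
        # Located far from each other
        c.distance_km > 500,
    ]
    return sum(signals)


def needs_human_review(c: Conversation, min_signals: int = 3) -> bool:
    """Flag for an employee only when enough signals occur together."""
    return count_risk_signals(c) >= min_signals
```

The point of the `min_signals` cutoff is the same one made above: no single criterion sends a conversation to a person, only several occurring together.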
The Super Bowl
The Department of Homeland Security classified the 2015 Super Bowl in Phoenix, AZ as a Level 1 security event, behind only events like the U.N. General Assembly. With over 60,000 people traveling to the event, and with the general threat of bombings and shootings at large gatherings that plagues our world today, the Secret Service reported that it would be monitoring Twitter and Facebook for threats. At the time, a Secret Service spokesperson told the Washington Post that screening for sarcasm was “just one of 16 or 18 things we are looking at.” But the agency was unable to acquire software that would help detect sarcasm and pick out fake threats. Strategies like this were also used during the Boston Marathon following 2013.
And so, a sarcastic machine was developed!
Rossano Schifanella, a computer science professor at the University of Turin, along with colleagues from Yahoo!, worked to teach computers about sarcasm. The team ran a research study in which English-speaking participants had to mark social media posts as sarcastic or not: first just text posts, and then text that also had images. What they found was that images, linguistic cues and wordplay (“I just loooooove snow” versus “I just love snow”), and punctuation (especially exclamation points) signaled sarcasm the most. They then created a mathematical algorithm, based on what they learned from humans, that detected sarcasm accurately 80 to 89 percent of the time. That’s pretty good for a machine.
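As a rough illustration of what those cues look like to a computer, here is a small sketch that pulls them out of a post. It is not the researchers' algorithm, only a toy feature extractor. The function name `sarcasm_cues`, the "three or more repeated letters" rule for stretched words, and the `has_image` flag are all my own assumptions.

```python
import re

# A letter repeated three or more times in a row, as in "loooooove".
# The three-letter cutoff is an assumption chosen for this example.
ELONGATED = re.compile(r"([a-z])\1{2,}", re.IGNORECASE)


def sarcasm_cues(text: str, has_image: bool = False) -> dict:
    """Collect the surface cues mentioned above for a single post."""
    return {
        "elongated_words": sum(1 for w in text.split() if ELONGATED.search(w)),
        "exclamation_marks": text.count("!"),
        "has_image": has_image,
    }


print(sarcasm_cues("I just loooooove snow"))
print(sarcasm_cues("I just love snow"))
```

The two example sentences differ only in the stretched word, so the first reports one elongated word and the second reports none. A real system would feed counts like these into a trained model rather than read them directly.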
What sort of opportunities does this present?
Think about marketers who reveal a new product. Often, they turn to data analytics of social media posts to understand how well the launch went. How many times is the product mentioned positively? How many times is it mentioned negatively? The numbers could be skewed by sarcasm.
I think the biggest and most widespread sarcastic campaign I’ve seen online is the #ThanksObama hashtag. If you’re wondering what it is, Urban Dictionary can tell you:
There must be hundreds of tweets a day that mention this hashtag, and not all of them can be counted as praise. Of course we humans know this, but a normal computer wouldn’t. A simple analytical tool might say, based on Twitter activity, that Obama has a high approval rating.
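Here is a tiny sketch of how that skew happens. The posts are made up for this example, the sarcasm labels are assigned by hand, and the "contains the word thanks" rule is a stand-in I invented for a simple analytical tool, not how any real product works.

```python
# Invented example posts, each paired with a hand-assigned sarcasm label.
posts = [
    ("Spilled coffee on my laptop. #ThanksObama", True),
    ("Stuck in traffic again #ThanksObama", True),
    ("Thanks to everyone who came out to vote today", False),
]


def naive_positive_count(posts):
    """Count every post containing 'thanks' as praise."""
    return sum(1 for text, _ in posts if "thanks" in text.lower())


def sarcasm_aware_positive_count(posts):
    """Count 'thanks' posts as praise only when not marked sarcastic."""
    return sum(
        1 for text, sarcastic in posts
        if "thanks" in text.lower() and not sarcastic
    )


print(naive_positive_count(posts))          # 3
print(sarcasm_aware_positive_count(posts))  # 1
```

The naive count calls all three posts praise, while the sarcasm-aware count keeps only one. In practice the hand labels would have to come from a sarcasm detector.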
Rossano Schifanella’s algorithm could help reduce wasted time and provide more accurate data analytics. The amount of content on the internet keeps growing, so more sophisticated formulas that align with human behavior need to be developed.
Take that, Siri.