Detecting sarcasm is among the toughest natural language understanding problems in AI, and it is receiving increasing research interest in computational linguistics and NLP. While recent studies have recognized the link between sarcasm and sentiment and have proposed various techniques for detecting sarcasm, none has directly and systematically studied the impact of sarcasm detection on sentiment analysis.
Crystalace is a sarcasm detection method developed by researchers at the Institute of High Performance Computing, A*STAR.
This site shows an interactive live demo of our method, introduces the science behind it and describes a few key use cases for applying the method. It also provides downloadable resources for the research community.
This work is described in a recent paper to appear in the 11th International Workshop on Semantic Evaluation (SemEval 2017).
Sarcasm is a complex communication phenomenon. It is often expressed in words that are positive in the literal sense but carry a negative emotional connotation.
Sentiment analysis, also known as opinion mining, is the study of feelings and opinions expressed in user-generated content such as social media posts. Sarcasm detection is closely related to sentiment analysis but is a distinct topic. As a classification task, the primary objective of sentiment analysis is to determine if a message is positive, negative, or neutral. In contrast, the objective of sarcasm detection is to determine if a message is sarcastic or not sarcastic.
To illustrate, let us look at two short text examples.
Example 1. Love my new phone! Only that the battery runs out very fast.
Example 2. Love my new phone that runs out battery so fast!
Failure to recognize sarcasm may lead to miscommunication (see examples of misinterpreted sarcastic tweets). For social media analytics and communication, the associated risk is amplified by the sheer volume and velocity of potentially sarcastic expressions that are mistaken for positive expressions.
Our Innovation
To capture discriminative and explainable sarcasm features, we designed a feature model based on a review and synthesis of related work across natural language processing, linguistics, psychology, speech and communication, and neuroscience.
The figure below presents an overview of the proposed sarcasm detection method, which we name "Crystalace".
To train and evaluate our sarcasm classifier, we downloaded the annotated tweets dataset from Riloff et al. (2013), pre-processed the tweets, and trained a linear SVM classifier using our feature model. Our method obtained an F1-score of .60, an improvement of .09 over the best condition reported in Riloff et al.'s original study. Based on these results, we trained the final Crystalace sarcasm classifier using the full dataset.
To date, the Crystalace API reaches an F1-score of 0.66 in distinguishing sarcastic from non-sarcastic expressions, evaluated with 10-fold cross-validation against human annotations as the ground truth.
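For readers less familiar with the metric, the F1-score is the harmonic mean of precision and recall. The Python sketch below shows the computation on a handful of made-up labels. It assumes the score is computed on the sarcastic class; the function name `f1_score_binary`, the label strings, and the toy data are illustrative assumptions and are not taken from the Crystalace code.

```python
def f1_score_binary(y_true, y_pred, positive="sarcastic"):
    """Return the F1-score for the `positive` class (0.0 if nothing is matched)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (not from the Riloff et al. dataset).
human = ["sarcastic", "sarcastic", "not_sarcastic", "not_sarcastic", "sarcastic"]
predicted = ["sarcastic", "not_sarcastic", "not_sarcastic", "sarcastic", "sarcastic"]

score = f1_score_binary(human, predicted)
print(round(score, 3))
```

In a 10-fold cross-validation setting, a score like this would be computed on each held-out fold and then averaged.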
For enquiries or collaboration opportunities, please contact Dr. Yang Yinping.
Enhancing Sentiment Analysis
To the best of our knowledge, no major sentiment analysis system developed to date has incorporated the capability to recognize sarcasm. In our latest research, we designed a sentiment analysis system enhanced with sarcasm detection, which we call "CrystalNest", and evaluated its performance. The results on the official SemEval-2017 Task 4A-4D test data provide evidence of the value of embedding sarcasm detection in sentiment analysis systems.
Detecting Sarcastic Five Star Reviews
Many online review sites allow users to give star ratings to a product, a service or an employer. Sometimes, however, the text of a review does not match its star rating, whether unintentionally or intentionally. We applied Crystalace to analyze 15 Amazon product reviews and detected eight sarcastic reviews that were actually marked as four or five stars.
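The kind of check described above can be sketched as a simple filter: keep the reviews whose star rating is high but whose text is flagged as sarcastic. The Python example below is only an illustration under stated assumptions. The Crystalace API is not shown on this page, so the classifier is passed in as a plain callable, and `toy_classifier`, `find_sarcastic_high_ratings` and the sample reviews are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Review = Tuple[int, str]  # (star rating, review text)


def find_sarcastic_high_ratings(
    reviews: Iterable[Review],
    is_sarcastic: Callable[[str], bool],
    min_stars: int = 4,
) -> List[Review]:
    """Return reviews rated `min_stars` or higher whose text is flagged as sarcastic."""
    return [
        (stars, text)
        for stars, text in reviews
        if stars >= min_stars and is_sarcastic(text)
    ]


def toy_classifier(text: str) -> bool:
    """Hypothetical stand-in for a real sarcasm classifier; keyword match only."""
    return "yeah right" in text.lower()


# Made-up reviews for illustration only.
reviews = [
    (5, "Works great, battery lasts two days."),
    (5, "Best purchase ever. Yeah right, it broke in a week."),
    (1, "Yeah right, 'premium quality'."),
]

flagged = find_sarcastic_high_ratings(reviews, toy_classifier)
for stars, text in flagged:
    print(stars, text)
```

Passing the classifier as a parameter keeps the filter independent of any particular sarcasm detection service.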
World "Cyber Sarcasm" Profile
How might users leverage sarcasm detection in business cases? Let's start by exploring a simple question: which countries are most sarcastic? We produced a world sarcasm profile based on a collection of tweets.
A recent finding about sarcasm is that it is closely associated with creativity. In a study published in Organizational Behavior and Human Decision Processes, researchers tested a novel theoretical model in which both the construction and interpretation of sarcasm lead to greater creativity following a simulated sarcastic conversation or after recalling a sarcastic exchange. They found that both sarcasm expressers and recipients reported more conflict but also demonstrated enhanced creativity, because sarcasm activates abstract thinking. It would therefore be very interesting to extend this simple world sarcasm profile analysis to a sociolinguistic study.
Emotion Intensity (EI) Lexicon
The Emotion Intensity (EI) Lexicon is a tab-delimited list of 3,204 emotion-related English words, common emoticons and Internet slang terms, labelled along two dimensions: strength and intensity. The lexicon was built for general-purpose emotion feature extraction, and hence could be useful for other NLP tasks or behavior prediction research.
Complete a simple registration to receive a link to download the lexicon.
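As a rough guide to working with a tab-delimited lexicon of this kind, the Python sketch below reads rows into a dictionary. The column order (`term`, `strength`, `intensity`), the label values, and the sample entries are assumptions made for illustration; the actual file layout should be checked after download.

```python
import csv
import io


def load_lexicon(handle):
    """Parse tab-delimited rows into {term: (strength, intensity)}.

    Assumed layout (hypothetical): term<TAB>strength<TAB>intensity.
    Values are kept as strings because their real type is not documented here.
    """
    lexicon = {}
    # QUOTE_NONE so that emoticons containing quote characters are read literally.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 3:
            continue  # skip blank or malformed rows
        term, strength, intensity = row[0], row[1], row[2]
        lexicon[term.strip().lower()] = (strength.strip(), intensity.strip())
    return lexicon


# Made-up sample rows standing in for the downloaded file.
sample = io.StringIO("furious\tstrong\thigh\nglad\tweak\tlow\n:)\tweak\tlow\n")
lexicon = load_lexicon(sample)
print(lexicon["furious"])
```

With a real file, the same function can be called on a handle opened with `open(path, encoding="utf-8", newline="")`.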
Tweet Pre-Processing (TweetCrystalizer) Script
Social media content such as tweets contains a virtually unlimited variety of non-standard expressions, such as user-created hashtags (e.g., #shitnooneeversay), misspelt or elongated words (e.g., greaaat, awwww), and unusual expressions or Internet slang (e.g., lolz, SMDH). This makes direct processing difficult. We developed a Tweet Pre-Processing (TweetCrystalizer) script that converts a tweet into normalized text. We have found this module helpful in improving the effectiveness of subsequent analysis.
Complete a simple registration to receive a link to download the Python script.
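To give a sense of what tweet normalization involves, here is a minimal Python sketch covering three of the cases mentioned above: hashtag markers, elongated words and slang. It is not the TweetCrystalizer script. The function name `normalize_tweet`, the two-entry slang table and the rule of collapsing any character repeated three or more times to a single character are simplifying assumptions.

```python
import re

# Hypothetical slang table; a real script would need a far larger vocabulary.
SLANG = {"lolz": "laughing out loud", "smdh": "shaking my damn head"}

# A word character repeated three or more times in a row, e.g. "aaa" in "greaaat".
ELONGATED = re.compile(r"(\w)\1{2,}")


def normalize_tweet(text: str) -> str:
    """Return a crudely normalized version of `text`."""
    tokens = []
    for token in text.split():
        if token.startswith("#"):
            token = token[1:]  # drop the hashtag marker; no word segmentation
        token = ELONGATED.sub(r"\1", token)  # "greaaat" -> "great"
        token = SLANG.get(token.lower(), token)  # expand known slang
        if token:
            tokens.append(token)
    return " ".join(tokens)


normalized = normalize_tweet("greaaat phone lolz #fail")
print(normalized)
```

Collapsing repeats to a single character is deliberately crude: it handles "greaaat" but would also shorten a word such as "cooool" to "col", so a practical script would combine it with a dictionary check. Splitting run-together hashtags into words is also left out of this sketch.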