
Identification of Sarcasm in Tweets


Sarcasm means expressing the opposite of what we actually feel. It can also be defined as satirical wit intended to insult, mock, or amuse, and it has to be identified and handled during natural language processing. On Twitter, many sarcastic tweets share a common structure that creates a positive/negative contrast between a sentiment and a situation. Specifically, sarcastic tweets often express a positive sentiment in reference to a negative activity or state. Consider the tweets below, each of which pairs a positive sentiment term with a negative activity or state.
(a) Wow! I feel happy when he denied my payment.
(b) Oh how I love being ignored.
(c) Absolutely adore it when my bus is late.
(d) I’m so pleased mom woke me up with vacuuming my room this morning.
The sarcasm in these tweets arises from the juxtaposition of a positive sentiment word (e.g., happy, love, adore, pleased) with a negative activity or state (e.g., denied my payment, being ignored, bus is late, woke me up with vacuuming my room).

The goal is to identify sarcasm that arises from the contrast between a positive sentiment and the negative situation it refers to. A key issue is to automatically recognize stereotypically negative “situations”, that is, activities and states that most people consider unenjoyable or undesirable. Because these situations are widely recognized as negative, they are rarely accompanied by an explicit negative sentiment. For example, “I feel sick” is universally understood to be a negative situation. Phrases that correspond to such negative situations must therefore be learned.

Bootstrapping Algorithm

A bootstrapping algorithm is used to learn terms and phrases for positive sentiments and negative situations automatically. The aim of the algorithm is to build a sarcasm classifier for tweets that recognizes contexts in which a positive sentiment is contrasted with a negative situation.
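To make the idea of a contrast-based classifier concrete, here is a minimal sketch. It assumes the two phrase lists have already been learned, and it flags a tweet when a positive sentiment phrase is followed closely by a negative situation phrase. The function names, the `max_gap` parameter, and the simple whitespace tokenizer are illustrative assumptions, not part of the method described above.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def find_phrase(tokens, phrase_tokens, start=0):
    """Return the first index >= start where phrase_tokens occurs, or -1."""
    n = len(phrase_tokens)
    for i in range(start, len(tokens) - n + 1):
        if tokens[i:i + n] == phrase_tokens:
            return i
    return -1


def has_contrast(tweet, positive_phrases, negative_phrases, max_gap=2):
    """True if a positive phrase is followed by a negative situation phrase
    with at most max_gap tokens in between (max_gap is an assumed parameter)."""
    tokens = tokenize(tweet)
    for pos in positive_phrases:
        pos_toks = pos.split()
        i = find_phrase(tokens, pos_toks)
        while i != -1:
            end = i + len(pos_toks)
            for neg in negative_phrases:
                j = find_phrase(tokens, neg.split(), end)
                if j != -1 and j - end <= max_gap:
                    return True
            i = find_phrase(tokens, pos_toks, i + 1)
    return False


positive = {"love", "adore"}                 # toy phrase lists for illustration
negative = {"being ignored", "bus is late"}

print(has_contrast("Oh how I love being ignored.", positive, negative))
print(has_contrast("I love my new phone", positive, negative))
```

With these toy lists, the first tweet is flagged because "love" is directly followed by "being ignored", while the second is not because no negative situation phrase appears.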

Learning Negative Situation Phrases


The initial phase of the bootstrapping method learns new phrases that correspond to negative situations. The learning process consists of two steps: (1) extracting candidate phrases and (2) selecting the suitable candidates. To collect candidate phrases for negative situations, we extract N-grams that follow a positive sentiment phrase in a sarcastic tweet. We pick every one-gram, two-gram and three-gram that occurs immediately to the right of a positive sentiment phrase.


I am very happy when he denied my payment #sarcasm


In the above tweet, “happy” is the positive sentiment term. Three N-grams are extracted as candidate negative situation phrases, namely those that immediately follow “happy”: when, when he, when he denied. Then, based on part-of-speech (POS) tags, the list is filtered to keep only N-grams with the intended syntactic structure. For negative situations, the goal is to learn verb phrase (VP) complements that are themselves verb phrases. So we require a candidate phrase to be either a one-gram tagged as a verb (V) or a phrase that matches one of 9 POS-based bigram patterns created to approximate the recognition of verbal complement structures.
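The candidate extraction and POS filtering steps can be sketched as follows. This is a minimal illustration under stated assumptions: the tag dictionary `TOY_TAGS` is hand-written rather than produced by a real tagger, and `ALLOWED_PATTERNS` is a hypothetical stand-in, since the actual bigram patterns are not listed here.

```python
def ngrams_right_of(tokens, phrase, max_n=3):
    """Return the 1..max_n-grams that start immediately after `phrase`."""
    phrase_toks = phrase.split()
    k = len(phrase_toks)
    candidates = []
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] == phrase_toks:
            following = tokens[i + k:]
            for n in range(1, min(max_n, len(following)) + 1):
                candidates.append(" ".join(following[:n]))
    return candidates


# Hand-assigned coarse tags (assumption: a real system would use a POS tagger).
TOY_TAGS = {"being": "V", "ignored": "V", "when": "ADV", "he": "PRON", "denied": "V"}

# Hypothetical patterns standing in for the real POS-based pattern list.
ALLOWED_PATTERNS = {("V",), ("V", "V"), ("V", "ADV")}


def keep_by_pos(candidates, tags, allowed):
    """Keep candidates whose tag sequence matches one of the allowed patterns."""
    kept = []
    for cand in candidates:
        pattern = tuple(tags.get(tok, "X") for tok in cand.split())
        if pattern in allowed:
            kept.append(cand)
    return kept


tweet = "i am very happy when he denied my payment #sarcasm".split()
candidates = ngrams_right_of(tweet, "happy")
print(candidates)  # ['when', 'when he', 'when he denied']
print(keep_by_pos(candidates, TOY_TAGS, ALLOWED_PATTERNS))  # []

tweet_b = "oh how i love being ignored".split()
print(keep_by_pos(ngrams_right_of(tweet_b, "love"), TOY_TAGS, ALLOWED_PATTERNS))
# ['being', 'being ignored']
```

Under these toy tags, none of the candidates from the "happy" tweet survive the filter because they do not begin with a verb, whereas "being" and "being ignored" from tweet (b) do.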

Learning Positive Verb Phrases

Learning positive sentiment phrases works in a similar way. First, we collect phrases that may convey a positive sentiment by extracting N-grams that come before a negative situation phrase in a sarcastic tweet. To learn positive sentiment verb phrases, we pick every one-gram and two-gram that occurs immediately to the left of a negative situation phrase.
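The mirror-image extraction, taking one-grams and two-grams to the left of a known negative situation phrase, might look like the sketch below. The function name and the whitespace tokenization are assumptions made for illustration.

```python
def ngrams_left_of(tokens, phrase, max_n=2):
    """Return the 1..max_n-grams that end immediately before `phrase`."""
    phrase_toks = phrase.split()
    k = len(phrase_toks)
    candidates = []
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] == phrase_toks:
            preceding = tokens[:i]
            for n in range(1, min(max_n, len(preceding)) + 1):
                candidates.append(" ".join(preceding[-n:]))
    return candidates


tweet = "oh how i love being ignored".split()
print(ngrams_left_of(tweet, "being ignored"))  # ['love', 'i love']
```

As with negative situations, these candidates would still need to be filtered and scored before being accepted as positive sentiment phrases.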

Learning Positive Predicative Phrases

The negative situation phrases are also used to extract predicative expressions that occur in close proximity. This relies on the same assumption that sarcasm often arises from the contrast between a positive sentiment and a negative situation: we target tweets that contain a negative situation with a predicative expression nearby, and assume that the predicative expression conveys a positive sentiment. We pick positive sentiment candidates by extracting one-grams, two-grams and three-grams that appear immediately after a verb and occur within five words of the negative situation phrase, on either side. The constraint requires only proximity, not adjacency, because predicative expressions often appear in a separate clause or sentence. For example, “It is just great that my data was stolen” or “My data was stolen. This is great.”
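A rough sketch of this proximity-based extraction is given below. Several details are assumptions: the verb is restricted to a small hand-written set of copulas (`COPULAS`), "within five words" is interpreted as at most five tokens between the candidate and the negative situation phrase, and candidates that overlap the negative phrase are skipped.

```python
COPULAS = {"am", "is", "are", "was", "were"}  # assumed set of predicate-introducing verbs


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def predicative_candidates(tokens, negative_phrase, window=5, max_n=3):
    """Collect 1..max_n-grams that directly follow a copula and lie within
    `window` tokens of the negative situation phrase, on either side."""
    neg = negative_phrase.split()
    k = len(neg)
    candidates = []
    for start in range(len(tokens) - k + 1):
        if tokens[start:start + k] != neg:
            continue
        end = start + k  # exclusive end of the negative phrase
        for v, tok in enumerate(tokens):
            # Ignore non-copulas and copulas inside the negative phrase itself.
            if tok not in COPULAS or start <= v < end:
                continue
            for n in range(1, max_n + 1):
                a, b = v + 1, v + 1 + n  # candidate spans tokens[a:b]
                if b > len(tokens) or (a < end and b > start):
                    break  # ran off the tweet or into the negative phrase
                gap = start - b if b <= start else a - end
                if gap <= window:
                    candidates.append(" ".join(tokens[a:b]))
    return candidates


print(predicative_candidates(
    tokenize("It is just great that my data was stolen"), "my data was stolen"))
# ['just', 'just great', 'just great that']
print(predicative_candidates(
    tokenize("My data was stolen. This is great."), "my data was stolen"))
# ['great']
```

The raw candidates are deliberately noisy ("just great that" is not a useful phrase), which is why a later filtering and selection step is needed.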

