
Word Sense Disambiguation


Word Sense Disambiguation (WSD) is the task of identifying the best sense of a word in a particular context when the word has multiple meanings. It can be framed as a classification problem: given a word and its possible meanings (senses), classify each occurrence of the word into one of its sense classes based on evidence from the context and from external knowledge sources.
For example, consider the following two sentences:
a) The workers at the plant were overworked.
b) The gardener was watering the plant.

In the first sentence, the word 'plant' refers to an industrial plant, whereas in the second it refers to a living plant (vegetation).
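To see how many senses even a common word carries, the short sketch below lists the noun senses of 'plant' recorded in WordNet (assuming NLTK and its WordNet data are installed; the exact output depends on the WordNet version):

    from nltk.corpus import wordnet as wn

    # List the noun senses of "plant" recorded in WordNet.
    for synset in wn.synsets("plant", pos=wn.NOUN):
        print(synset.name(), "-", synset.definition())

    # The output includes both an "industrial building" sense and a
    # "living organism" sense, which is exactly the ambiguity a WSD
    # system has to resolve.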




Figure: A Word Sense Disambiguation system.



WSD is an important part of many applications such as Machine Translation, Information Retrieval, Information Extraction, Content Analysis, Word Processing (e.g. spelling correction) and the Semantic Web. It can help improve the relevance of search engines and support tasks such as anaphora resolution, coherence analysis and inference.

There are mainly two types of Word Sense Disambiguation, namely:

  • Lexical Sample (or targeted WSD) - The system is required to disambiguate only a restricted set of target words, usually one per sentence.
  • All-words WSD - The system needs to disambiguate all the words in a text.
A Word Sense Disambiguation system involves four main elements, namely:
  • Selection of Word Senses - The process of identifying the most appropriate sense of a word in a particular context. It is the key problem in WSD.
  • External Knowledge Sources - Knowledge resources are a fundamental part of WSD: they provide the information required to map a word to its appropriate senses. They range from labelled or unlabelled text corpora to machine-readable dictionaries, thesauri and ontologies.
  • Representation of Context - The text is converted into a structured format so that it can be given as input to an automatic method. This requires preprocessing steps such as tokenization, POS tagging, lemmatization, chunking and parsing (a short preprocessing sketch follows this list).
  • Choice of a Classification Method - The final step of WSD. There are many approaches to resolving the ambiguity, which are explained below.
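As a rough illustration of the preprocessing used to represent context, the sketch below tokenizes, POS-tags and lemmatizes a sentence with NLTK (assuming the tokenizer models, tagger and WordNet data have been downloaded):

    import nltk
    from nltk.stem import WordNetLemmatizer

    sentence = "The gardener was watering the plant."

    # Tokenization and POS tagging.
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)

    # Lemmatization (the WordNet lemmatizer expects simplified POS tags).
    lemmatizer = WordNetLemmatizer()
    pos_map = {"J": "a", "N": "n", "V": "v", "R": "r"}
    lemmas = [lemmatizer.lemmatize(word, pos_map.get(tag[0], "n"))
              for word, tag in tagged]

    print(tagged)  # [('The', 'DT'), ('gardener', 'NN'), ('was', 'VBD'), ...]
    print(lemmas)  # ['The', 'gardener', 'be', 'water', 'the', 'plant', '.']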
The main approaches to Word Sense Disambiguation are:
  • Supervised WSD - Machine learning techniques are used to train a classifier that assigns words to their appropriate senses, using sense-labelled training sets.
  • Unsupervised WSD - Unlabelled corpora are used to induce and assign word senses.
These approaches can further be distinguished as knowledge-based and corpus-based: the former makes use of machine-readable dictionaries, ontologies and thesauri, whereas the latter makes use of (labelled or unlabelled) corpora for disambiguation.
Another way to categorize WSD approaches is as token-based or type-based. In a token-based approach, each occurrence of a word is assigned a meaning according to the context in which it appears, whereas a type-based approach assumes that a word keeps the same sense throughout a single text.

List of WSD Algorithms


Supervised Disambiguation Techniques are:
  • Decision Lists - An ordered set of rules for assigning an appropriate sense to a target word. The rules are sorted in decreasing order of score, so a decision list can be viewed as a list of weighted if-then-else rules (see the first sketch after this list).
  • Naive Bayes - A classification technique based on Bayes' theorem. It predicts the sense of a word w from the conditional probability of each sense Si given the features fj observed in the context; the sense with the maximum probability is chosen as the most appropriate one in that context (a second sketch after this list illustrates it).
  • Neural Networks - An interconnected group of artificial neurons that processes data with a computational model. The network is trained on pairs of input features and desired responses, and the weights are progressively adjusted until the output unit corresponding to the desired sense has a higher activation than any other output unit.
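To make the idea of weighted if-then-else rules concrete, here is a toy decision list for 'plant'. The rules and scores are invented purely for illustration; in a real system they would be learned (for example from log-likelihood scores) on a sense-tagged corpus:

    # A toy decision list for the target word "plant".
    # Each rule is (score, feature test, sense); rules are tried in
    # decreasing order of score and the first matching rule wins.
    rules = [
        (2.8, lambda ctx: "water" in ctx,  "plant_living"),
        (2.5, lambda ctx: "worker" in ctx, "plant_factory"),
        (1.9, lambda ctx: "garden" in ctx, "plant_living"),
        (1.2, lambda ctx: "power" in ctx,  "plant_factory"),
    ]
    rules.sort(key=lambda rule: rule[0], reverse=True)

    def disambiguate(context_words, default="plant_living"):
        for score, test, sense in rules:
            if test(context_words):
                return sense
        return default  # fall back to a default (e.g. most frequent) sense

    print(disambiguate({"the", "gardener", "water", "plant"}))
    # -> plant_living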
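Below is a minimal Naive Bayes WSD sketch using scikit-learn; the tiny sense-labelled training set is invented for illustration, and a real system would be trained on a sense-tagged corpus such as SemCor:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Hypothetical sense-labelled contexts for the target word "plant".
    contexts = [
        "the workers at the plant were overworked",
        "the chemical plant was shut down for repairs",
        "the gardener was watering the plant",
        "the plant needs sunlight and water to grow",
    ]
    senses = ["factory", "factory", "living", "living"]

    # Bag-of-words features + multinomial Naive Bayes classifier.
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(contexts, senses)

    print(model.predict(["she planted flowers and watered every plant daily"]))
    # -> ['living'], the sense with the highest posterior probability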
Unsupervised disambiguation techniques aim at identifying sense clusters rather than assigning sense labels. Some of them are:
  • Context Clustering - Each occurrence of a target word in a corpus is represented as a context vector. These vectors are then clustered into groups based on the contextual similarity between occurrences, each group identifying one sense of the target word (see the clustering sketch after this list).
  • Word Clustering - This method aims at clustering words that are semantically similar (synonyms).
  • Co-occurrence Graph - A graph-based approach in which the vertices correspond to words in a text and the edges connect pairs of words that co-occur in a syntactic relation, in the same paragraph or in a larger context. Each edge is weighted according to the relative co-occurrence frequency of the two words it connects: edges between words that co-occur most often get weights close to 0, edges between words that rarely co-occur get weights close to 1, and edges whose weight exceeds a certain threshold are discarded (a small graph-building sketch also follows this list).
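As a rough sketch of context clustering, the following example (hypothetical sentences, assuming scikit-learn is available) represents each occurrence of 'plant' by a TF-IDF vector of its context and groups the occurrences with k-means:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # Hypothetical occurrences (contexts) of the target word "plant".
    contexts = [
        "the workers at the plant were overworked",
        "the nuclear plant generates most of the city's power",
        "the gardener was watering the plant in the greenhouse",
        "this plant needs sunlight and water to grow",
    ]

    # One context vector per occurrence, clustered into two sense groups.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(contexts)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

    for label, text in zip(labels, contexts):
        print(label, text)  # occurrences in the same cluster share a sense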
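And here is a small sketch of building a co-occurrence graph with networkx (assumed to be installed); the sentences, the weight formula and the threshold are illustrative, chosen so that strongly co-occurring pairs get weights near 0:

    from collections import Counter
    from itertools import combinations
    import networkx as nx

    sentences = [
        ["gardener", "water", "plant"],
        ["plant", "grow", "water"],
        ["worker", "plant", "shift"],
    ]

    word_freq = Counter(w for s in sentences for w in set(s))
    pair_freq = Counter(frozenset(p) for s in sentences
                        for p in combinations(set(s), 2))

    graph = nx.Graph()
    for pair, count in pair_freq.items():
        w1, w2 = tuple(pair)
        # Weight is near 0 for strongly co-occurring pairs, near 1 otherwise.
        weight = 1 - max(count / word_freq[w1], count / word_freq[w2])
        if weight <= 0.9:  # discard edges whose weight exceeds the threshold
            graph.add_edge(w1, w2, weight=weight)

    print(graph.edges(data=True))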
Knowledge-Based Disambiguation Techniques are:
  • Lesk Algorithm - Based on the assumption that the words in a sentence relate to a common topic. All possible dictionary definitions (glosses) of the ambiguous word are considered, and the sense whose definition overlaps most with the context is chosen as the appropriate one (see the sketch after this list).
  • Selectional Preferences - A selectional preference denotes a word's tendency to co-occur with words belonging to certain lexical sets. In this method, selectional preferences are used to restrict the number of possible meanings of a target word in context: senses that violate the constraints are discarded and senses that satisfy them are preferred.
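NLTK ships a simplified Lesk implementation, so a minimal example (assuming NLTK's tokenizer models and WordNet data are downloaded) looks like this:

    from nltk.wsd import lesk
    from nltk.tokenize import word_tokenize

    sent1 = word_tokenize("The workers at the plant were overworked.")
    sent2 = word_tokenize("The gardener was watering the plant.")

    # lesk() picks the WordNet synset whose definition overlaps most
    # with the words of the given context sentence.
    print(lesk(sent1, "plant", pos="n"))
    print(lesk(sent2, "plant", pos="n"))

    # Note: with such short contexts the simplified Lesk algorithm can
    # still pick an unexpected sense; it is a knowledge-based baseline,
    # not a state-of-the-art disambiguator.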
There are many challenges in Word Sense Disambiguation; some of them are:
  • Different Algorithms for Different Applications - Different applications need different algorithms. For example, Machine Translation requires the exact sense of a word, whereas Information Retrieval only needs confirmation that a word is used in the same sense in the query and in the retrieved documents, not the exact sense itself.
  • Representation of Word Senses - The choice of how to represent word senses and how to divide a word's meaning into senses is a fundamental problem in WSD. The ever-changing nature of senses poses a further problem for their representation.
  • Knowledge Acquisition Bottleneck - The effort of manually creating knowledge resources, and of revising them whenever the disambiguation scenario changes, is known as the knowledge acquisition bottleneck. It is one of the major problems in WSD because the task relies heavily on knowledge.
  • Task-Dependent Sense Inventory - The sense inventory is task-dependent: each task requires its own division of word meanings into the senses relevant to that task.






    


