In the current internet age, civilization has undergone rapid change, and NLP research has produced several remarkable artificial-intelligence applications, e.g., Google, IBM’s Watson, and Apple’s Siri. In this blog, we will discuss the evolution of NLP research and present it as the intersection of three overlapping curves, namely the Syntactics, Semantics, and Pragmatics curves.
Poising on the Syntactics Curve (Bag of Words):
Syntax-centered NLP is still widely used for tasks such as information retrieval and extraction, topic modeling, and auto-categorization. It can be broadly grouped into three main categories: keyword spotting, lexical affinity, and statistical methods.
Keyword spotting is the most popular approach due to its accessibility and cost-effectiveness: text is classified into categories based on the presence of fairly unambiguous keywords (a minimal sketch follows the list below). Some of the most popular projects related to keyword spotting include:
(a) Ortony’s Affective Lexicon: groups words into affective categories
(b) Penn Treebank: a corpus consisting of over 4.5 million
words of American English annotated for part-of-speech (POS) information
(c) PageRank: Google’s famous ranking algorithm for web pages
(d) LexRank: a stochastic graph-based method for computing
relative importance of textual units for NLP
(e) TextRank: a graph-based ranking model for text processing, based on two unsupervised methods for keyword and sentence extraction.
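To make keyword spotting concrete, here is a minimal Python sketch that classifies text by counting category keywords. The LEXICON and its two categories are hypothetical toy data invented for illustration; real systems rely on curated resources such as Ortony’s Affective Lexicon.

# Minimal keyword-spotting sketch: pick the category whose
# keywords appear most often in the text.
# The LEXICON below is a hypothetical toy resource.
LEXICON = {
    "sports": {"goal", "match", "tournament", "coach"},
    "finance": {"stock", "market", "dividend", "inflation"},
}

def classify(text: str) -> str:
    tokens = set(text.lower().split())
    # Score each category by how many of its keywords appear.
    scores = {cat: len(tokens & words) for cat, words in LEXICON.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(classify("The coach praised the team after the match"))  # -> sports

Note how brittle this is: a sentence that expresses the same idea without any listed keyword falls straight into "unknown", which is exactly the weakness the next two approaches try to address.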
Lexical affinity is a slightly more sophisticated mechanism than keyword spotting: rather than just detecting obvious words, it assigns arbitrary words a probabilistic ‘affinity’ for a particular category. For example, ‘accident’ might carry a 75% probability of indicating a negative event, with such probabilities estimated from linguistic corpora. This approach performs better than keyword spotting, but it has several problems. The sentence “I met with an accident” does point to a negative event, yet “I met my girlfriend by accident” describes a pleasant, unplanned surprise, so fixed word-level probabilities can mislead. Another problem is that the affinities are biased by the corpus the model is trained on, which makes the model hard to reuse and a truly domain-independent model difficult to build. A toy affinity estimator is sketched below.
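As a rough illustration of how such affinities might be estimated, the following Python sketch computes a maximum-likelihood estimate of P(negative | word) from a tiny, made-up labeled corpus; any serious estimate would of course require a large annotated corpus.

from collections import Counter

# Toy labeled corpus (hypothetical) for estimating word affinities.
corpus = [
    ("I met with an accident on the highway", "negative"),
    ("The accident blocked traffic for hours", "negative"),
    ("I met my girlfriend by accident", "positive"),
    ("What a pleasant surprise", "positive"),
]

seen = Counter()      # how many sentences contain each word
negative = Counter()  # ...of which are labeled negative
for text, label in corpus:
    for word in set(text.lower().split()):
        seen[word] += 1
        if label == "negative":
            negative[word] += 1

def negative_affinity(word: str) -> float:
    # Maximum-likelihood estimate of P(negative | word).
    word = word.lower()
    return negative[word] / seen[word] if seen[word] else 0.0

print(negative_affinity("accident"))  # ~0.67 on this toy corpus

Even on this toy corpus, ‘accident’ gets a sizeable negative affinity despite appearing in the girlfriend sentence, which illustrates how word-level probabilities blur over context.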
Statistical NLP uses language models based on popular machine-learning algorithms such as maximum likelihood, expectation maximization, support vector machines, and conditional random fields. Statistical models are semantically weak, so they achieve acceptable accuracy only when given sufficiently large text input.
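As one possible illustration (not any specific system from the literature), a bag-of-words SVM text classifier can be sketched with scikit-learn; it assumes scikit-learn is installed, and the four training sentences are toy data.

# Sketch of statistical NLP: a bag-of-words SVM text classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = [
    "the team won the match",
    "stocks fell as the market dipped",
    "the coach announced the lineup",
    "inflation pushed bond yields higher",
]
train_labels = ["sports", "finance", "sports", "finance"]

# TF-IDF turns text into word-count features; the SVM learns a
# linear decision boundary over them.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)
print(model.predict(["the market rallied today"]))  # ['finance']

With only four training sentences the model will misclassify anything outside their vocabulary, echoing the point above that statistical models are semantically weak and need large text input.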
Surfing the Semantics Curve (Bag of Concepts):
Semantics-based NLP focuses on the meaning associated with text, rather than just processing documents at the syntax level. Semantics-based NLP approaches can be broadly grouped into two main categories: techniques that leverage external knowledge, e.g., ontologies (taxonomic NLP) or semantic knowledge bases (noetic NLP), and methods that exploit only the intrinsic semantics of documents (endogenous NLP).
Taxonomic NLP includes initiatives that aim to build universal taxonomies or Web ontologies for grasping the subsumptive, or hierarchical, semantics associated with natural language expressions. Attempts to build taxonomic resources are countless and include both resources crafted by human experts or community efforts, such as WordNet and Freebase, and automatically built knowledge bases (a small WordNet example follows the list below). Examples of such knowledge bases include:
(a) WikiTaxonomy: a taxonomy extracted from Wikipedia’s
category links.
(b) YAGO: a semantic knowledge base derived from WordNet,
Wikipedia, and GeoNames
(c) NELL (Never-Ending Language Learning): a semantic machine-learning system that acquires knowledge from the Web every day
(d) Probase: a research prototype that aims to build a unified taxonomy of worldly facts from 1.68 billion Web pages in the Bing repository.
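As a small, concrete taste of such subsumptive knowledge, the sketch below queries WordNet’s is-a hierarchy through NLTK; it assumes nltk is installed and the wordnet corpus has been downloaded.

# Querying WordNet's is-a (hypernym) hierarchy via NLTK.
# Assumes: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

car = wn.synsets("car")[0]   # first sense of "car"
print(car.hypernyms())       # direct parents, e.g. motor_vehicle
# Follow one path up the hierarchy to the root concept ("entity").
path = car.hypernym_paths()[0]
print(" -> ".join(s.name() for s in path))

This is precisely the kind of hierarchical knowledge (car is-a motor vehicle is-a ... is-a entity) that taxonomic NLP tries to build automatically and at Web scale.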
Noetic NLP embraces all the mind-inspired approaches to NLP that attempt to compensate for the lack of domain adaptivity and implicit semantic-feature inference in traditional algorithms, e.g., first-principles modeling or explicit statistical modeling. Noetic NLP differs from taxonomic NLP in that it does not focus on encoding subsumption knowledge but rather attempts to collect idiosyncratic knowledge about objects, actions, and events. Noetic NLP, moreover, performs reasoning in an adaptive and dynamic way, e.g., by generating context-dependent results or by discovering new semantic patterns that are not explicitly encoded in the knowledge base.
Foreseeing the Pragmatics Curve (Bag of Narratives):
Narrative understanding and generation are central for
reasoning, decision-making, and ‘sensemaking’. Besides being a key part of
human-to-human communication, narratives are the means by which reality is
constructed and planning is conducted. Decoding how narratives are generated
and processed by the human brain might eventually lead us to truly understand
and explain human intelligence and consciousness. Computational modeling
is a powerful and effective way to investigate narrative understanding. A lot
of the cognitive processes that lead humans to
understand or generate narratives have traditionally been
of interest to AI researchers under the umbrella of knowledge representation,
common-sense reasoning, social cognition, learning, and NLP. There are already a few pioneering works that attempt to understand narratives by leveraging discourse structure, argument-support hierarchies, plan graphs, and common-sense reasoning.
Conclusion:
Word- and concept-level approaches to NLP are just a first step towards natural language understanding. The future of NLP lies in biologically and linguistically motivated computational paradigms that enable narrative understanding and, hence, ‘sensemaking’. Computational intelligence has great potential to play an important role in NLP research.