Amalgam

Showing posts with the label natural language processing

Sentiment Analysis of Hi-En Code-Mixed Data

Introduction

The number of users on Online Social Networks (OSN) in the Indian subcontinent has grown exponentially over the past few years. This has prompted several stakeholders and first responders to turn to OSN to make decisions and plan their next move. In multilingual societies such as India, code-mixed online discourse, tweets, and posts are ubiquitous. Performing language and text analysis on such data is not trivial, because the traditional NLP tools are built for English and do not work well on code-mixed data. In this blog post, I will focus on Hindi-English (Hi-En) code-mixed data, although most of the concepts apply to other forms of code-mixing as well.

What is Code-Mixing?

Code-mixing is the natural phenomenon of embedding linguistic units such as phrases, words, or morphemes of one language into an utterance of another (Muysken, 2000; Duran, 199...

Programming Language Naturalization

Have you ever thought of a programming language that can be written in natural language? We come across many kinds of applications that need graphs to be plotted, data to be stored, and complex actions to be performed through the Internet of Things or on other data. These requirements can be met with a programming language, which has to be written precisely, following all of its rules. Alternatively, natural language can be converted into a formal language using semantic parsing. The ability of this parsing is limited, and it is not as powerful as implementation through programming. An example of this approach is “Voxelurn”.

Figure: Example of natural language programming

This concept is called “naturalization”. It bridges the gap between natural language and the core language. In any application development, we need to select a core language and train the system with rules (conver...

Word Sense Disambiguation

Word Sense Disambiguation (WSD) is the ability to identify the best sense of a word in a particular context when the word has multiple meanings. It can be treated as a classification problem: given a word and its meanings (senses), assign the word to one of its sense classes based on evidence from the context and external knowledge sources. For example, consider the following two sentences:

a) The workers at the plant were overworked.
b) The gardener was watering the plant.

In the first sentence, the word 'plant' refers to an industrial plant, whereas in the second it refers to a living plant.

Figure: Word Sense Disambiguation System

WSD is an important part of many applications such as Machine Translation, Information Retrieval, Information Extraction, Content Analysis, Word Processing (Spelling Correction), and the Semantic Web. It can help improve search engine relevance, anaphora resolution, coherence, inference, etc. ...
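To make the classification view concrete, here is a minimal sketch of a simplified Lesk-style overlap method applied to the two 'plant' sentences above. It is not necessarily the approach the post goes on to describe. The sense labels, the glosses in `SENSES`, the stopword list, and the function names are all hypothetical and written by hand for illustration; a real system would take its glosses from an external knowledge source.

```python
import re

# Hypothetical, hand-written sense inventory for the word "plant".
# A real system would pull glosses from an external knowledge source.
SENSES = {
    "industrial plant": "a factory or building where workers carry out industrial production",
    "living plant": "a living organism such as a tree or flower that grows in soil "
                    "and is tended by a gardener",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {
    "a", "an", "the", "at", "in", "of", "or", "and", "is", "was", "were",
    "that", "such", "as", "by", "where", "out",
}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose gloss shares the most words with the sentence."""
    context = tokenize(sentence)
    return max(senses, key=lambda label: len(context & tokenize(senses[label])))


if __name__ == "__main__":
    for sentence in (
        "The workers at the plant were overworked.",
        "The gardener was watering the plant.",
    ):
        print(sentence, "->", disambiguate(sentence))
```

The context evidence here is plain word overlap: 'workers' matches the first gloss and 'gardener' matches the second. This approach is brittle (for instance, 'watering' would not match 'water' without stemming), which is one reason practical WSD systems combine context with richer knowledge sources.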

Making the world a better place with Chatbots

We have entered an era where technology is making its way into every aspect of our lives. Fields such as Artificial Intelligence, Machine Learning, Natural Language Processing, and Computer Vision are being extensively studied and applied to make our lives easier and more convenient. We now have the popular Siri, Cortana, and Google's voice assistant. What started as simple chatbots for basic speech recognition has paved the way for much more sophisticated voice assistants that can now do almost anything for you, from answering basic questions to scheduling meetings. They can even pick up upcoming trips from your email and notify you of the status of your flight, courtesy of learning techniques and algorithms that are improving by the day. Various websites now have chatbots that provide customer service and assistance. It is thus clear that there is a growing understanding of, and need for, chatbots integrated into our day-to-day operations. Amazon's Lex is taking t...