Math Reasoning using NLP Techniques

Recent trends in NLP show great interest in understanding text in order to perform various tasks. Work on understanding text for mathematical reasoning has focused on automatically solving school-level math word problems. Advances in this area have great potential for use in automatic tutoring services for school students.
This blog post focuses on a web-based tool named ILLINOIS MATH SOLVER that supports mathematical reasoning [1]. The solver can answer a wide range of mathematics questions, from operation questions like “What is the result when 6 is divided by the sum of 7 and 5?” to elementary-school-level math word problems like “I bought 6 apples. I ate 3 of them. How many do I have left?”. ILLINOIS MATH SOLVER provides an easy way to test the robustness of the system, and it also serves as a tool for crowd-based data acquisition.
Fig. Screenshot of Illinois Math Solver

Working Description
The system comprises two modules. The first is a Context Free Grammar (CFG) based semantic parser that handles queries about operations between numbers (addition, difference, fraction). The parser creates a list of numbers and mathematical terms using a set of derivation rules. For example, if the question is "What is the result when 15 is multiplied to difference of 5 and 12?", it first creates the list {15, multiplied, difference, 5, 12}. Here, the word "multiplied" helps split the expression into sub-expressions. The grammar uses 26 such derivation rules, and parsing is done with the Cocke-Younger-Kasami (CYK) algorithm.
The second module is an Arithmetic Problem Solver [2], which handles arithmetic problems with multiple steps and operations. It decomposes an input arithmetic problem into several decision problems and learns predictors for them; the predictions are then used to generate a binary expression tree for the mathematical expression that solves the problem.
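To make the first module more concrete, here is a minimal Python sketch of how an operation question can be reduced to a list of numbers and mathematical terms and then evaluated. It is not the solver's actual grammar: the keyword tables, the function names (tokenize, parse, evaluate, solve), and the reading of "difference of a and b" as a - b are all assumptions made for illustration, and a small recursive-descent parser stands in for the 26 derivation rules and the CYK algorithm.

```python
import operator
import re

# Hypothetical keyword tables; the real system uses its own derivation rules.
INFIX = {"plus": "+", "minus": "-", "multiplied": "*", "times": "*", "divided": "/"}
PREFIX = {"sum": "+", "difference": "-", "product": "*"}
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def tokenize(question):
    """Keep only numbers and known mathematical terms, in order of appearance."""
    tokens = []
    for word in re.findall(r"\d+(?:\.\d+)?|[a-z]+", question.lower()):
        if word[0].isdigit():
            tokens.append(float(word))
        elif word in INFIX or word in PREFIX:
            tokens.append(word)
    return tokens


def parse_operand(tokens, pos):
    """Parse a number, or a prefix term followed by two operands ("sum of 7 and 5")."""
    if pos >= len(tokens):
        raise ValueError("expected a number or an operation word")
    tok = tokens[pos]
    if tok in PREFIX:
        first, pos = parse_operand(tokens, pos + 1)
        second, pos = parse_operand(tokens, pos)
        return (PREFIX[tok], first, second), pos
    if isinstance(tok, float):
        return tok, pos + 1
    raise ValueError(f"unexpected term: {tok}")


def parse(tokens, pos=0):
    """Parse an operand, optionally followed by an infix term and a sub-expression."""
    left, pos = parse_operand(tokens, pos)
    if pos < len(tokens) and tokens[pos] in INFIX:
        op = INFIX[tokens[pos]]
        right, pos = parse(tokens, pos + 1)
        return (op, left, right), pos
    return left, pos


def evaluate(tree):
    """Evaluate a nested (op, left, right) tuple; leaves are plain numbers."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left), evaluate(right))
    return tree


def solve(question):
    tokens = tokenize(question)
    tree, pos = parse(tokens)
    if pos != len(tokens):
        raise ValueError("could not parse the whole question")
    return evaluate(tree)


print(solve("What is the result when 6 is divided by the sum of 7 and 5 ?"))
print(solve("What is the result when 15 is multiplied to difference of 5 and 12 ?"))
```

Under these assumptions, the second question is first reduced to the list [15.0, 'multiplied', 'difference', 5.0, 12.0], which mirrors the list described above, and the infix term "multiplied" is what separates the left operand from the sub-expression on its right.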
An example of arithmetic word problem with its solution and expression tree
A multi-class classifier is learned to predict, for a pair of quantities, the math operation along with its order; this is the operation at the lowest common ancestor (LCA) node of the two quantities in the expression tree. In the figure above, the task of this classifier is to determine that the addition is performed first, followed by the multiplication. Also, the number "2" in the figure is irrelevant to the solution; another classifier is trained to identify such irrelevant quantities in the problem.
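The idea of labeling quantity pairs with the operation at their LCA node can be sketched as follows. The tree (3 + 5) * 4 is a made-up example rather than the problem in the figure, and the Node class and function names are hypothetical; the sketch only shows what labels such a classifier would be trained to predict, not how the predictors are learned. It also assumes that all quantities in the problem are distinct.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass
class Node:
    """A leaf holds a quantity in `value`; an internal node holds an operation in `op`."""
    value: Optional[float] = None
    op: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def leaf_paths(node, path=()):
    """Yield (quantity, ancestors from the root down to the leaf's parent)."""
    if node.op is None:
        yield node.value, path
    else:
        yield from leaf_paths(node.left, path + (node,))
        yield from leaf_paths(node.right, path + (node,))


def lca_operations(root):
    """Map each pair of quantities to the operation at their lowest common ancestor."""
    labels = {}
    for (a, path_a), (b, path_b) in combinations(list(leaf_paths(root)), 2):
        lca = None
        for x, y in zip(path_a, path_b):
            if x is not y:
                break
            lca = x  # deepest ancestor shared so far
        labels[(a, b)] = lca.op
    return labels


# Hypothetical expression tree for (3 + 5) * 4.
tree = Node(
    op="*",
    left=Node(op="+", left=Node(3), right=Node(5)),
    right=Node(4),
)

for pair, op in lca_operations(tree).items():
    print(pair, op)
```

Here the pair (3, 5) is labeled with "+", while (3, 4) and (5, 4) are labeled with "*". Given such pairwise labels, the tree can be read back as "add 3 and 5 first, then multiply by 4", which is the kind of ordering the paragraph above describes.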

Evaluation
The system was evaluated on the union of three datasets: addition and subtraction problems from the AI2 dataset (AI2) [3], single-operation problems from the Illinois dataset (IL) [4], and multi-step problems from the Commoncore dataset (CC) [2]. It was found to achieve state-of-the-art performance on all of these datasets.

Schematic Diagram of Illinois Math Solver

Limitations
Currently, the system works by combining the numbers mentioned in the text, so it cannot handle phrases like "1 day", which require prior knowledge such as 1 day -> 24 hours. It is also unable to handle algebra word problems involving multiple equations with one or more variables.

References
[1] Subhro Roy and Dan Roth. 2016. "Illinois Math Solver: Math Reasoning on the Web". In Proceedings of NAACL-HLT (Demonstrations), Association for Computational Linguistics.
[2] Subhro Roy and Dan Roth. 2015. "Solving general arithmetic word problems". In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
[3] M. J. Hosseini, H. Hajishirzi, O. Etzioni, and N. Kushman. 2014. "Learning to solve arithmetic word problems with verb categorization". In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014).
[4] S. Roy, T. Vieira, and D. Roth. 2015. "Reasoning about quantities in natural language". Transactions of the Association for Computational Linguistics, 3.
