How do we communicate? Not a minute passes without someone asking or answering a question. (In IIITD, not even a second. :P)
As machines have become part of our day-to-day communication, they also need to ask us questions. Question generation uses ideas from natural language processing (NLP) as its backbone. It has applications in many fields, such as IVR systems, tutoring systems and requirement elicitation before systems are developed. It can enhance students' learning through dialogue-based systems, which help in deeper learning and understanding. It can even be used to generate questions for our quizzes.
Techniques
1. Syntax analysis or phrase structure analysis
The most common approach is to convert complex sentences into simple sentences, then identify the parts of speech and the entities in each sentence using a syntactic parser. Based on the relationships between the parts of speech and the entities, grammatical rules can be formed. These rules help cluster the sentences by the type of question they support, and the questions are then framed.
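The rule-based idea above can be sketched in a few lines. This is only a toy illustration: regular expressions stand in for a real syntactic parser, and the `RULES` table, `split_compound` and `generate_questions` are hypothetical names invented for this example.

```python
import re

# Hypothetical rule table (an assumption for this sketch): each rule pairs a
# surface pattern for a simple sentence with the question form it allows.
# A real system would match on parser output, not on raw text.
RULES = [
    # "<Subject> is in <place>."  ->  "Where is <Subject>?"
    (re.compile(r"^(?P<subj>[A-Z]\w*(?: [A-Z]\w*)*) is in (?P<obj>.+?)\.?$"),
     "Where is {subj}?"),
    # "<Subject> wrote <object>."  ->  "Who wrote <object>?"
    (re.compile(r"^(?P<subj>[A-Z]\w*(?: [A-Z]\w*)*) wrote (?P<obj>.+?)\.?$"),
     "Who wrote {obj}?"),
]


def split_compound(sentence):
    """Naively split a compound sentence on ' and ' into simple sentences."""
    parts = re.split(r",? and ", sentence.rstrip("."))
    return [part.strip() + "." for part in parts if part.strip()]


def generate_questions(sentence):
    """Apply every rule to every simple sentence and collect the questions."""
    questions = []
    for simple in split_compound(sentence):
        for pattern, template in RULES:
            match = pattern.match(simple)
            if match:
                questions.append(template.format(**match.groupdict()))
    return questions


print(generate_questions("Delhi is in India and Tagore wrote Gitanjali."))
```

Each rule decides both the question type (where, who) and how the question is framed, which is the clustering step described above in miniature.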
2. Semantic Analysis
Semantic analysis identifies each predicate in a sentence, along with its associated arguments and modifiers, and specifies their semantic roles. It gives better results because words in human sentences carry different meanings in different contexts.
Sentence: Delhi is in India.
Default question: Where is Delhi?
India is a country (it has a semantic meaning), so we can frame a different question:
In which country is Delhi?
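Here is a minimal sketch of the Delhi example, assuming we already know the semantic type of each entity. The `ENTITY_TYPES` lookup table and the `location_question` function are made up for illustration; a real system would get the types from a named-entity recogniser or a knowledge base.

```python
# Hypothetical lookup table (an assumption for this sketch) that maps an
# entity to its semantic type.
ENTITY_TYPES = {"India": "country", "Delhi": "city", "Asia": "continent"}


def location_question(subject, location):
    """Frame a question for a sentence of the form '<subject> is in <location>'."""
    semantic_type = ENTITY_TYPES.get(location)
    if semantic_type is None:
        # No semantic information: fall back to the default question.
        return f"Where is {subject}?"
    # The semantic type lets us ask a more specific question.
    return f"In which {semantic_type} is {subject}?"


print(location_question("Delhi", "India"))   # In which country is Delhi?
print(location_question("Delhi", "Narnia"))  # Where is Delhi?
```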
3. NLU Analysis
A much newer way to do the same is NLU (natural language understanding) analysis. The central idea behind NLU analysis is “what the sentence is communicating”. For every new sentence, a matching sentence pattern is identified from a set of templates, as opposed to generating questions on every possible sentence constituent. The matched pattern determines the question that is generated. Template generation is the most important task. It uses dependency parsing and semantic role labelling. The dependency parse provides a representation of the grammatical relations between individual words in a sentence. The templates should also be designed to frame a question on the major theme of the sentence.
Figure: Pattern Distribution
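The template matching step can be sketched roughly as follows. The sketch assumes the sentence has already been run through a semantic role labeller and arrives as a dictionary of roles; the PropBank-style labels (`ARG0`, `ARG1`, `ARGM-LOC`, `ARGM-TMP`), the `TEMPLATES` list and `question_from_frame` are all assumptions made for this example, not the templates of any particular system.

```python
# Hypothetical templates (an assumption for this sketch), ordered from most
# to least specific. Each entry pairs the roles a sentence must contain with
# the question it produces.
TEMPLATES = [
    ({"ARG0", "ARG1", "ARGM-TMP"}, "When did {ARG0} {predicate} {ARG1}?"),
    ({"ARG0", "ARG1"}, "What did {ARG0} {predicate}?"),
    ({"ARG1", "ARGM-LOC"}, "Where is {ARG1}?"),
]


def question_from_frame(frame):
    """Return the question for the first template whose roles all appear."""
    roles = set(frame) - {"predicate"}
    for required_roles, template in TEMPLATES:
        if required_roles <= roles:
            return template.format_map(frame)
    return None  # no pattern matched, so no question is generated


# Hand-written frames standing in for real semantic role labelling output.
delhi = {"predicate": "be", "ARG1": "Delhi", "ARGM-LOC": "in India"}
budget = {"predicate": "approve", "ARG0": "the committee",
          "ARG1": "the budget", "ARGM-TMP": "yesterday"}

print(question_from_frame(delhi))   # Where is Delhi?
print(question_from_frame(budget))  # When did the committee approve the budget?
```

Only one question comes out per sentence, the one for the best matching pattern, rather than one for every constituent.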
Assigning Ranks for better questions
To generate better questions, the candidate questions are ranked by popularity, usage and aptness. The ranking is based on human input about the best possible question for a given sentence. As a result, the questions generated for a given sentence agree better with human natural language instincts.
Figure: Ranking Question
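One simple way to picture the ranking is to sort the candidate questions by their average human rating. This is a sketch under that assumption only; the `rank_questions` function, the 1 to 5 rating scale and the sample ratings are invented for illustration.

```python
def rank_questions(candidates, human_ratings):
    """Sort candidate questions by average human rating, best first.

    `human_ratings` maps a question to a list of scores given by people.
    Questions nobody has rated are treated as scoring 0.
    """
    def mean_rating(question):
        ratings = human_ratings.get(question, [])
        return sum(ratings) / len(ratings) if ratings else 0.0

    return sorted(candidates, key=mean_rating, reverse=True)


candidates = [
    "Where is Delhi?",
    "In which country is Delhi?",
    "Is Delhi in India?",
]
# Made-up ratings on a 1 to 5 scale, purely for illustration.
human_ratings = {
    "Where is Delhi?": [3, 4],
    "In which country is Delhi?": [5, 4],
}

print(rank_questions(candidates, human_ratings)[0])  # In which country is Delhi?
```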
Conclusion
The closer the algorithm gets to natural language by considering meaning, context and natural usage, the better the results.