Argumentation Mining: A Machine Learning Perspective

Argumentation is the process whereby arguments are constructed, exchanged and evaluated in light of their interactions with other arguments. An argument consists of a set of premises, i.e. pieces of evidence such as facts, offered in support of a claim.

Figure 1: Argumentation tree-structure

Argumentation plays an important role in many areas. Many professionals, e.g. scientists, lawyers, journalists or managers, implicitly or explicitly handle arguments systematically. They routinely undertake argumentation as an integral part of their work, identifying pros and cons to analyze a situation before presenting information to an audience or making a decision.

Argumentation mining (AM) is a research area at the intersection of natural language processing, argumentation theory and information retrieval. The goal is to automatically extract argumentation structures from natural language texts so that they can be analyzed more closely by computer programs. Such structures include the premises, the conclusions, the argumentation scheme, and the links between the main and secondary arguments or between an argument and its counter-argument.

From an application perspective, AM can be considered in some respects an evolution of sentiment analysis: while the goal of opinion mining is to understand what people think about something, the aim of argumentation mining is to understand why, thus unveiling reasoning processes rather than just detecting opinions and sentiment.

AM poses a scientifically engaging challenge, especially from a machine learning (ML) perspective. Indeed, AM is a difficult NLP task that combines many different components, such as information extraction, knowledge representation, and discourse analysis.

We now review ML methods for the task of automatically extracting arguments from text.

All the argument mining frameworks proposed so far can be described as multi-stage pipeline systems whose input is a natural, free-text document and whose output is a marked-up document in which arguments (or parts of arguments) are annotated. Each stage addresses a sub-task of the whole argumentation mining problem by employing one or more machine learning and natural language processing techniques.
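As a rough illustration of this pipeline view, the sketch below wires three stages together and returns an annotated document. All names (`AnnotatedDocument`, `run_pipeline`, and the three stage callables) are hypothetical and chosen only for this example; the stages in the usage part are trivial stand-ins, not real models.

```python
import re
from dataclasses import dataclass, field


@dataclass
class AnnotatedDocument:
    """Hypothetical container for the pipeline output."""
    text: str
    components: list = field(default_factory=list)  # (label, span_text) pairs
    relations: list = field(default_factory=list)   # (source_idx, target_idx, relation)


def run_pipeline(text, is_argumentative, extract_components, predict_relations):
    """Run the three stages in sequence; each stage is passed in as a callable."""
    # Naive sentence splitting on end-of-sentence punctuation (an assumption).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Stage 1: keep only the sentences judged argumentative.
    kept = [s for s in sentences if is_argumentative(s)]
    # Stage 2: extract labelled argument components from each kept sentence.
    components = [c for s in kept for c in extract_components(s)]
    # Stage 3: predict relations between the extracted components.
    relations = predict_relations(components)
    return AnnotatedDocument(text, components, relations)


# Toy stand-in stages, for illustration only.
def toy_detector(sentence):
    return " because " in sentence


def toy_extractor(sentence):
    claim, premise = sentence.rstrip(".").split(" because ", 1)
    return [("Claim", claim), ("Premise", premise)]


def toy_linker(components):
    return [(1, 0, "support")] if len(components) >= 2 else []


doc = run_pipeline(
    "It rained today. We should stay in because the roads are flooded.",
    toy_detector, toy_extractor, toy_linker,
)
print(doc.components)
print(doc.relations)
```

In a real system each callable would be replaced by a trained model for the corresponding stage.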

Methods :

1. Argumentative Sentence Detection :

A first stage usually consists of detecting which sentences in the input document are argumentative. This task is typically implemented with a machine learning classifier, such as Naive Bayes, Support Vector Machines, Maximum Entropy classifiers, Logistic Regression, Decision Trees or Random Forests.

A common implementation trains a binary classifier whose goal is simply to discard propositions that are not argumentative, while a second classifier at a later stage in the pipeline is trained to distinguish among the various argument components (e.g., claims from premises). Alternatively, a single multi-class predictor could be employed to discriminate between all the possible categories of argument elements.

In both cases, two crucial issues within this step involve:

(1) the choice of the classifier, and

(2) the features to be used to describe the sentences.
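To make these two choices concrete, here is a minimal sketch of the binary variant. It assumes a multinomial Naive Bayes classifier with add-one smoothing and plain bag-of-words features. The class name, the tokenizer and the six training sentences are made up for illustration; a real system would use far richer features and much more data.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and strip surrounding punctuation from each token."""
    tokens = (t.strip(".,;:!?()\"'").lower() for t in sentence.split())
    return [t for t in tokens if t]


class NaiveBayesSentenceClassifier:
    """Multinomial Naive Bayes over bag-of-words features (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha            # smoothing constant
        self.class_counts = Counter() # number of training sentences per label
        self.word_counts = {}         # label -> Counter of token frequencies
        self.vocabulary = set()

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            self.class_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for token in tokenize(sentence):
                counts[token] += 1
                self.vocabulary.add(token)
        return self

    def log_scores(self, sentence):
        total = sum(self.class_counts.values())
        # Tokens never seen in training are ignored.
        known = [t for t in tokenize(sentence) if t in self.vocabulary]
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocabulary)
            scores[label] = math.log(n / total) + sum(
                math.log((counts[t] + self.alpha) / denom) for t in known
            )
        return scores

    def predict(self, sentence):
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Tiny made-up training set.
train = [
    ("We should ban smoking because it harms public health.", "argumentative"),
    ("Therefore, taxes must rise to fund schools.", "argumentative"),
    ("Nuclear power is safe since modern reactors rarely fail.", "argumentative"),
    ("The meeting starts at nine in the morning.", "non-argumentative"),
    ("She bought apples and bread at the market.", "non-argumentative"),
    ("The report was published last week.", "non-argumentative"),
]
clf = NaiveBayesSentenceClassifier().fit(*zip(*train))
print(clf.predict("Cars should be banned because they pollute."))
```

Swapping in another classifier from the list above, or a multi-class label set such as claim/premise/none, changes only the model and the labels; the overall fit/predict shape stays the same.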

2. Argumentative Element Detection :

Once the non-argumentative sentences have been discarded by the first stage of the pipeline, the exact boundaries of the argumentative elements, sometimes also called Argumentative Discourse Units (ADUs), have to be detected. This phase depends heavily on the adopted argument model, since the AM system must be capable of discriminating between all the possible argumentative elements in that model.

Due to its simplicity and generality, the premises/conclusion model is usually adopted in existing AM systems. Regardless of the argument model, in addition to distinguishing between elements, a so-called segmentation problem has to be addressed at this stage of the AM pipeline, since an argument element does not necessarily correspond exactly to a whole sentence. Three different cases can in fact be distinguished:

(1) only a portion of the sentence coincides with an argumentative element;

(2) two or more argumentative elements can be present within the same sentence;

(3) an argumentative element can span across multiple sentences.
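One way to frame the segmentation problem, which is an assumption here rather than something the reviewed systems are stated to share, is token-level sequence labelling with BIO tags (B = beginning of an element, I = inside, O = outside). The sketch below only shows how such tags are decoded into labelled spans. The tags are written by hand and stand in for the output of a hypothetical tagger, and `decode_bio` is an illustrative name.

```python
def decode_bio(tokens, tags):
    """Turn parallel token/BIO-tag lists into (label, text) spans.

    An I- tag that does not continue an open span of the same label is
    treated like O, so the token is dropped.
    """
    spans = []
    label, buffer = None, []

    def flush():
        # Close the currently open span, if any.
        if buffer:
            spans.append((label, " ".join(buffer)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            label, buffer = tag[2:], [token]
        elif tag.startswith("I-") and buffer and tag[2:] == label:
            buffer.append(token)
        else:
            flush()
            label, buffer = None, []
    flush()
    return spans


tokens = "Smoking should be banned because it harms health , experts said .".split()
tags = ["B-Claim", "I-Claim", "I-Claim", "I-Claim", "O",
        "B-Premise", "I-Premise", "I-Premise", "O", "O", "O", "O"]
print(decode_bio(tokens, tags))
```

The example sentence covers the first two cases: each element is only part of the sentence, and two elements share one sentence. Running the tagger over the whole document instead of sentence by sentence would also allow elements that cross sentence boundaries.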

3. Argumentative Structure Prediction :

After the argumentative elements have been detected, a further stage in the pipeline aims to predict links between arguments or between argument components. If the desired output is only the relations between argumentative elements, the system produces a sort of map of the arguments retrieved from the input document. Another possibility is to also infer the connections between whole arguments, in which case support and attack relations have to be distinguished. This second task is a very important step, because the output of the argumentation mining system could be used as input to a formal argumentation framework, where different semantics can be applied to identify sets of arguments with desired characteristics.
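As an illustration of what applying a semantics can mean, the sketch below assumes the mined attack relations are loaded into a Dung-style abstract argumentation framework and computes its grounded extension, i.e. the smallest set of arguments that defends all of its own members. The choice of grounded semantics, the function name and the toy arguments are assumptions made for this example. Support relations are ignored, because the abstract framework models attacks only.

```python
def grounded_extension(arguments, attacks):
    """Compute the grounded extension of a finite abstract argumentation framework.

    arguments: iterable of argument identifiers
    attacks:   set of (attacker, target) pairs
    """
    arguments = list(arguments)
    attackers = {a: set() for a in arguments}
    for source, target in attacks:
        attackers[target].add(source)

    extension = set()
    while True:
        # An argument is defended if each of its attackers is itself
        # attacked by some argument already in the extension.
        defended = {
            a for a in arguments
            if all(any((d, b) in attacks for d in extension) for b in attackers[a])
        }
        if defended == extension:
            return extension
        extension = defended


# Toy framework: C attacks B, and B attacks A.
args = ["A", "B", "C"]
atts = {("C", "B"), ("B", "A")}
print(sorted(grounded_extension(args, atts)))
```

Here C is unattacked and therefore accepted, which defeats B and reinstates A. If two arguments only attack each other, the grounded extension is empty, reflecting that neither can be accepted without further evidence.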

Conclusion :

Argumentation mining represents a novel, exciting application domain for machine learning. Despite some promising initial results, much work remains to be done in order to exploit the full potential of ML approaches within the AM community, and to build successful applications whose output can serve as input to formal argumentation frameworks.

The methods reviewed in this article mostly target homogeneous and domain-specific data sources. An interesting direction could be developing AM techniques capable of handling heterogeneous data sources, as well as relational and structured data.
