Reviews and Comments on Paper 60

Paper information

Paper #60: Sonia Vázquez, Zornitsa Kozareva and Andrés Montoyo. Textual Entailment beyond Semantic Similarity Information

Abstract: The variability of semantic expression is a special characteristic of natural language. This variability is challenging for many natural language processing applications that try to infer the same meaning from different text variants. In order to treat this problem a generic task has been proposed: Textual Entailment Recognition. In this paper, we present a new Textual Entailment approach based on Latent Semantic Indexing (LSI) and the cosine measure. This proposed approach extracts semantic knowledge from different corpora and resources. Our main purpose is to study how the acquired information can be combined with an already developed and tested Machine Learning Entailment system (MLEnt). The experiments show that the combination of MLEnt, LSI and cosine measure improves the results of the initial approach.

Summary of received reviews and comments

Reviews superseded by other reviews are shown in grey in the table.

	confidence	score
Review 1	4	2
Review 2	3	2
Review 3	2	2

Reviews and Comments

Review 1

PC member:
Reviewer:
Overall rating:	2 (accept: I will argue for this paper)
Confidence:	4
Relevance: Is this paper relevant for this conference?	2 (accept (I will argue for this paper))
Soundness: Is this paper technically sound and complete?	2 (accept (I will argue for this paper))
Are the claims sufficiently supported by experimental/theoretical results?	3 (strong accept)
Significance: Are the results/ideas interesting for other AI researchers?	2 (accept (I will argue for this paper))
Originality: Are the results or ideas novel and previously unknown?	2 (accept (I will argue for this paper))
Readability: Is the paper well-organized and easy to understand?	1 (weak accept (vote accept but don't mind rejecting))
Language: Is the paper written in correct English and style?	1 (weak accept (vote accept but don't mind rejecting))
Format: Is the paper correctly and consistently formatted?	2 (accept (I will argue for this paper))
Review:	CONTRIBUTION OF THE PAPER: This paper evaluates two techniques for the textual entailment task: Latent Semantic Indexing (LSI) and the cosine measure. These techniques were also added to a previously developed textual entailment system named MLEnt, and the improvement from these additions has been evaluated as well. Textual entailment has recently been defined as a common solution for modeling language variability in different NLP applications. The evaluation is carried out under the PASCAL framework, which is a good framework for covering the broad range of semantic-oriented inferences needed for practical applications. The task is an interesting one, as yet not widely known in the community, with many implications for tasks such as QA and IE, among others. For these reasons, I consider this paper worthy of acceptance.

POSITIVE ASPECTS: The authors describe a novel approach which exploits the semantic capability of LSI. Although the LSI technique has already been applied to the Textual Entailment task (see E. Newman et al., in RTE1 2005), the authors' approach of building several term-document matrices in order to obtain the semantic information is an interesting and well-functioning idea, especially the experiment that generates the matrices from the text-hypothesis corpora. The authors also study the influence of lemmatization on resolving the textual entailment problem. The application of the cosine measure to the textual entailment task is also an original idea. Normally this measure is used in combination with word frequencies, while the authors of this paper take a different approach in which the relevant domains extracted from WordNet Domains are used. The authors conducted an exhaustive experimental study which demonstrates the influence of these two measures when applied to an already existing machine-learning-based textual entailment system. The experiments demonstrate that the machine learning system improves its performance. Therefore, I consider that the new information which LSI and the cosine measure introduce to the classifier improves the performance and helps the MLEnt system to correctly determine more text-hypothesis TE pairs. I found the future work section extremely interesting: the authors will use the LSI method to obtain synonyms, antonyms and other types of word relations.

NEGATIVE ASPECTS: I feel the lack of a brief background, focusing overmuch on recent PASCAL workshops plus a couple of key recent references which triggered interest in entailment as a task. In the introduction section, the authors cite several related works, but none of these references is discussed. I felt uncertain about how the relevant documents for the semantic space are obtained from the BNC corpus (see section 2.1). Perhaps the authors consider this obvious; unfortunately it was not obvious to me! It would be better to explain it in detail. In section 2.3 the authors extract information from the WordNet glosses and assign to each word its associated domain. I suppose this process is automatic, but I don't know what happens when the same term appears in different glosses associated with different domains. In this case, is the term added to two different domains? Explain this process in detail. In the experiments and results section, the results in Table 2 are quite clear, but no comment is made on the reduction in accuracy when the LSI-BNC technique is applied. Finally, I would like to have an objective view of how the performance of this system compares to that of current textual entailment systems. For this purpose, I suggest that the authors include a comparative table with other systems. Maybe this is impossible because of the page limit.

CHANGES TO IMPROVE THE PAPER: The whole paper would benefit from careful proofreading. I recommend that the authors avoid the excessive use of "So", "that is to say" and "in order to". The acronyms LSI and BNC are introduced several times in the paper; avoid the repetition. In Table 3, the acronym DF has not been previously introduced. I suppose it stands for "document frequency". Was the MLEnt system previously developed by the authors? I couldn't find any reference to it in the paper. Some capitalization errors appear in the references. In the 4th reference, the author's name should be Peñas.

FURTHER COMMENTS: In the future, I suggest that the authors explore the combination of the MLEnt system with the newly developed LSI and cosine measures through a hybrid strategy, and compare its influence with that of the feature-based integration already presented.

ITEMS BELOW ARE JUSTIFICATION OF THE SCORES IF NEGATIVE: (1) IS THIS PAPER RELEVANT FOR THIS CONFERENCE? (2) IS THIS PAPER TECHNICALLY SOUND AND COMPLETE? (3) ARE THE CLAIMS SUFFICIENTLY SUPPORTED BY EXPERIMENTAL OR THEORETICAL RESULTS? (4) ARE THE RESULTS/IDEAS INTERESTING FOR OTHER AI RESEARCHERS? (5) ARE THE RESULTS OR IDEAS NOVEL AND PREVIOUSLY UNKNOWN? (6) IS THE PAPER WELL-ORGANIZED AND EASY TO UNDERSTAND? (7) IS THE PAPER WRITTEN IN CORRECT ENGLISH AND STYLE? (8) IS THE PAPER CORRECTLY AND CONSISTENTLY FORMATTED?
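To make the question about a term that belongs to several domains concrete, here is a minimal sketch of a cosine measure computed over domain counts rather than word frequencies. It is only an illustration under stated assumptions, not the authors' implementation: the table WORD_DOMAINS, the function names and the example sentences are all hypothetical, and the choice to count a word once in every domain it belongs to is one possible answer to the question, not necessarily the one used in the paper.

```python
import math
from collections import Counter

# Hypothetical word-to-domain table, for illustration only. It is NOT taken
# from WordNet Domains or from the paper under review.
WORD_DOMAINS = {
    "doctor": ["medicine"],
    "surgeon": ["medicine"],
    "hospital": ["medicine", "buildings"],
    "clinic": ["medicine", "buildings"],
    "goal": ["sport"],
    "match": ["sport"],
}


def domain_vector(sentence):
    """Map a sentence to a sparse vector of domain counts.

    Assumption: a word listed under several domains contributes one count
    to each of them. Words missing from the table are ignored.
    """
    counts = Counter()
    for word in sentence.lower().split():
        for domain in WORD_DOMAINS.get(word, []):
            counts[domain] += 1
    return counts


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # no domain information for at least one sentence
    return dot / (norm_u * norm_v)


text = domain_vector("the doctor works at the hospital")
hypothesis = domain_vector("the surgeon works at the clinic")
unrelated = domain_vector("the match ended with a late goal")

print(cosine(text, hypothesis))  # close to 1: same domain profile
print(cosine(text, unrelated))   # 0.0: no shared domain
```

In this sketch the text and the hypothesis share no content word, yet their domain vectors coincide, which is the kind of signal a frequency-based cosine over words would miss.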
PC only:
Time:	Jul 14, 13:57

Review 2

PC member:
Overall rating:	2 (accept: I will argue for this paper)
Confidence:	3
Relevance: Is this paper relevant for this conference?	3 (strong accept)
Soundness: Is this paper technically sound and complete?	1 (weak accept (vote accept but don't mind rejecting))
Are the claims sufficiently supported by experimental/theoretical results?	2 (accept (I will argue for this paper))
Significance: Are the results/ideas interesting for other AI researchers?	2 (accept (I will argue for this paper))
Originality: Are the results or ideas novel and previously unknown?	1 (weak accept (vote accept but don't mind rejecting))
Readability: Is the paper well-organized and easy to understand?	1 (weak accept (vote accept but don't mind rejecting))
Language: Is the paper written in correct English and style?	2 (accept (I will argue for this paper))
Format: Is the paper correctly and consistently formatted?	2 (accept (I will argue for this paper))
Review:	CONTRIBUTION OF THE PAPER: Takes advantage of the novel approach of using a corpus for TE that consists only of the TE text/hypothesis sentences.

POSITIVE ASPECTS: Good ideas for improving well-known algorithms.

NEGATIVE ASPECTS: Adding the relevant domains information does not seem to improve the results very much. There is no comparison with other RTE2 systems. In some cases the system does not even reach the performance of the average RTE system; in other cases, however, it is significantly better (particularly for the QA dev test set). The paper does not clearly explain the large difference in performance between the test set and the dev set in several cases.

CHANGES TO IMPROVE THE PAPER: Page 4 says "adjetive" instead of "adjective". Page 6 says "Table1" instead of "Table 1".
PC only:
Time:	Jul 16, 04:42

Review 3

PC member:
Reviewer:
Overall rating:	2 (accept: I will argue for this paper)
Confidence:	2
Relevance: Is this paper relevant for this conference?	2 (accept (I will argue for this paper))
Soundness: Is this paper technically sound and complete?	2 (accept (I will argue for this paper))
Are the claims sufficiently supported by experimental/theoretical results?	2 (accept (I will argue for this paper))
Significance: Are the results/ideas interesting for other AI researchers?	2 (accept (I will argue for this paper))
Originality: Are the results or ideas novel and previously unknown?	1 (weak accept (vote accept but don't mind rejecting))
Readability: Is the paper well-organized and easy to understand?	2 (accept (I will argue for this paper))
Language: Is the paper written in correct English and style?	2 (accept (I will argue for this paper))
Format: Is the paper correctly and consistently formatted?	2 (accept (I will argue for this paper))
Review:	CONTRIBUTION OF THE PAPER: The authors present a Textual Entailment approach based on Latent Semantic Indexing and the cosine measure.

POSITIVE ASPECTS: The method considers the extraction of semantic knowledge from corpora and resources. The authors analyze how to improve their results by combining them with a Machine Learning Entailment system.

NEGATIVE ASPECTS: No comparison of the results with other methods.

CHANGES TO IMPROVE THE PAPER: Explain the difference between this work and that of Deerwester et al. When comparing your method with others, please mention the (knowledge-rich) method for measuring the similarity between texts based on semantic graphs, as described, e.g., in: M. Montes y Gómez, A. Gelbukh, A. López López, R. Baeza-Yates. Flexible Comparison of Conceptual Graphs. Lecture Notes in Computer Science N 2113, Springer-Verlag, pp. 102-111; http://nlp.cic.ipn.mx/Publications/2001/DEXA-2001-Flexible.htm.
PC only:
Time:	Aug 10, 06:42