Reviews and Comments on Paper 199

Paper information

Paper #199: Zornitsa Kozareva and Andrés Montoyo. An approach for Textual Entailment Recognition based on Stacking and Voting
Abstract: This paper presents a machine-learning approach for the recognition of textual entailment. For our approach we modell lexical and semantic features. We study the effect of stacking and voting joint classifier combination techniques which boost the final performance of the system. In an exhaustive experimental evaluation, the performance of the developed approach is measured. The obtained results demonstrate that an ensemble of classifiers achieves higher accuracy than an individual classifier and comparable results to already existing textual entailment systems.

Summary of received reviews and comments


          Confidence  Score
Review 1  2           2
Review 2  3           2
Review 3  3           3


Reviews and Comments

Review 1

PC member:  
Overall rating: 2 (accept: I will argue for this paper)
Confidence: 2
Relevance: Is this paper relevant for this conference? 2 (accept (I will argue for this paper))
Soundness: Is this paper technically sound and complete? 1 (weak accept (vote accept but don't mind rejecting))
Are the claims sufficiently supported by experimental/theoretical results? 2 (accept (I will argue for this paper))
Significance: Are the results/ideas interesting for other AI researchers? 1 (weak accept (vote accept but don't mind rejecting))
Originality: Are the results or ideas novel and previously unknown? 1 (weak accept (vote accept but don't mind rejecting))
Readability: Is the paper well-organized and easy to understand? 2 (accept (I will argue for this paper))
Language: Is the paper written in correct English and style? 2 (accept (I will argue for this paper))
Format: Is the paper correctly and consistently formatted? 2 (accept (I will argue for this paper))
Review: CONTRIBUTION OF THE PAPER:

The authors describe their contribution as the design of lexical and semantic features that measure the similarity of two texts in order to determine whether the texts entail each other. They create complementary classifiers and study how to combine them through stacking and voting schemes.

The novelty of the idea is limited: from a technical point of view, the authors apply a set of well-known techniques based on similarity measures.
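
For reference, the two combination schemes named above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the YES/NO labels, and the lookup-table meta-classifier are assumptions made for the example, and a real system would normally train a proper learner on held-out base predictions.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Voting: return the label chosen by most base classifiers.

    Ties are broken in favour of the label that appears first.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def train_stacker(base_outputs, gold_labels):
    """Stacking: fit a toy meta-classifier on the base classifiers' outputs.

    For every tuple of base predictions seen in training, remember the
    gold label that most often accompanied it (an assumed, very simple
    meta-learner used only for illustration).
    """
    table = defaultdict(Counter)
    for outputs, gold in zip(base_outputs, gold_labels):
        table[tuple(outputs)][gold] += 1
    return {pattern: c.most_common(1)[0][0] for pattern, c in table.items()}


def stacked_predict(stacker, outputs):
    """Use the meta-classifier for known patterns, else fall back to voting."""
    return stacker.get(tuple(outputs), majority_vote(outputs))


if __name__ == "__main__":
    # Hypothetical outputs of three base classifiers on four text pairs.
    held_out = [
        ("YES", "YES", "NO"),
        ("NO", "YES", "NO"),
        ("YES", "NO", "NO"),
        ("YES", "NO", "NO"),
    ]
    gold = ["YES", "NO", "YES", "YES"]
    stacker = train_stacker(held_out, gold)
    print(majority_vote(("YES", "NO", "NO")))             # prints NO
    print(stacked_predict(stacker, ("YES", "NO", "NO")))  # prints YES
```

In this toy data the two schemes disagree on the last pattern: voting follows the two NO votes, whereas the stacked model has learned that the first classifier tends to be right when the others disagree with it.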

POSITIVE ASPECTS:

The text is clear and easy to understand, and it includes a section devoted to experimental results with comparative tests. The work is a nice example of an NLP application.

NEGATIVE ASPECTS:

It is not clearly specified where the corpora considered come from (subsection 4.1).

The comparative study (section 5) is limited to a single previous proposal, and the authors do not justify why this one was chosen for comparison with their approach.

CHANGES TO IMPROVE THE PAPER:

The authors should emphasize the actual practical and theoretical aspects of their contribution and clearly describe the nature of the corpora considered. The choice of the tools compared in the experimental section should also be explained.

FURTHER COMMENTS:

In my opinion this is a preliminary paper.

PC only: In my opinion this is only a preliminary paper.
Time: Jul 3, 10:33

Review 2

PC member:  
Reviewer:  
Overall rating: 2 (accept: I will argue for this paper)
Confidence: 3
Relevance: Is this paper relevant for this conference? 3 (strong accept)
Soundness: Is this paper technically sound and complete? 2 (accept (I will argue for this paper))
Are the claims sufficiently supported by experimental/theoretical results? 3 (strong accept)
Significance: Are the results/ideas interesting for other AI researchers? 2 (accept (I will argue for this paper))
Originality: Are the results or ideas novel and previously unknown? 2 (accept (I will argue for this paper))
Readability: Is the paper well-organized and easy to understand? 1 (weak accept (vote accept but don't mind rejecting))
Language: Is the paper written in correct English and style? 1 (weak accept (vote accept but don't mind rejecting))
Format: Is the paper correctly and consistently formatted? 2 (accept (I will argue for this paper))
Review: CONTRIBUTION OF THE PAPER:
This paper presents a machine-learning approach for textual entailment tasks. The features are based on text summarization techniques and similarity measures. The experiments are well organized and their conclusions are very interesting for further research.


POSITIVE ASPECTS:
The set of features and the obtained results can be applied in ML systems as well as in rule-based systems.
The results are on par with the average of current textual entailment systems, and the ML proposal is genuinely novel.


NEGATIVE ASPECTS:
The use of some features must be better justified.


CHANGES TO IMPROVE THE PAPER:
- The use of " in LaTeX is wrong. Replace it with `` and ''.
- page 2, paragraph 3, reviled --> revised
- page 2, last paragraph, presenting feature 2: the rabbit example illustrates that the feature is not sensitive enough, which is why features 3 and 4 are proposed. As it stands, the use of feature 2 is therefore not justified. The authors could provide another example to illustrate the utility of this feature.
- page 4, footnotes have different sizes.
- References are seriously incomplete. Most of them lack the year, pages, and publication details.
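
The quotation-mark fix in the first item could be applied mechanically. The following is a minimal sketch, assuming the source uses plain paired " characters on a single line; the helper name `latex_quotes` is hypothetical, and the substitution would also touch quotes inside verbatim environments or babel shorthands, so the result should be checked by hand.

```python
import re


def latex_quotes(text):
    """Replace each pair of straight double quotes with LaTeX `` and ''.

    Only quotes that open and close on the same line are rewritten;
    an unpaired " is left untouched.
    """
    return re.sub(r'"([^"\n]*)"', r"``\1''", text)


if __name__ == "__main__":
    print(latex_quotes('the "rabbit" example'))  # prints the ``rabbit'' example
```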

FURTHER COMMENTS:



PC only:  
Time: Jul 14, 18:17

Review 3

PC member:  
Overall rating: 3 (strong accept)
Confidence: 3
Relevance: Is this paper relevant for this conference? 2 (accept (I will argue for this paper))
Soundness: Is this paper technically sound and complete? 3 (strong accept)
Are the claims sufficiently supported by experimental/theoretical results? 3 (strong accept)
Significance: Are the results/ideas interesting for other AI researchers? 2 (accept (I will argue for this paper))
Originality: Are the results or ideas novel and previously unknown? 2 (accept (I will argue for this paper))
Readability: Is the paper well-organized and easy to understand? 2 (accept (I will argue for this paper))
Language: Is the paper written in correct English and style? 2 (accept (I will argue for this paper))
Format: Is the paper correctly and consistently formatted? 2 (accept (I will argue for this paper))
Review: CONTRIBUTION OF THE PAPER:
Presents a new scheme of complementary classifiers for Textual Entailment tasks that obtains slightly higher performance.


POSITIVE ASPECTS:
The paper provides good theoretical background and is dense with information (which can make it a little difficult to read), and everything is brought together in several clearly explained experiments.


NEGATIVE ASPECTS:
Using an ensemble of classifiers that achieves better performance than each classifier by itself is not new in general. Perhaps it has not been applied to the TE task before.


CHANGES TO IMPROVE THE PAPER:
Minor English flaws: modell, reviled, reviel, wheather.

Page 2 says that the results suggest that ML for TE is possible. In general, we already know this is possible. Please make the claim specific to your particular ML technique.

Reference [5]: year?

Compare your approach with a knowledge-rich approach to text comparison, such as: M. Montes-y-Gómez, A. Gelbukh, A. López-López. Comparison of Conceptual Graphs. Lecture Notes in Artificial Intelligence, N 1793, Springer-Verlag, pp. 548-556; http://nlp.cic.ipn.mx/Publications/2000/MICAI-2000-Comparison.htm.
PC only:  
Time: Aug 10, 07:02