CICLing 2014 Proceedings

Click on the number on the right-hand side for the full text. Notes: These numbers roughly correspond to the page numbers in the book, but not exactly. For citation, please see the final ToC. The text of the papers corresponds to the final book, but formatting and layout, including the number of pages, may differ.

Download all LNCS papers in one file. Subfolders correspond to volumes 1 and 2. Again, the numbers only roughly correspond to the page numbers.

Download all non-LNCS papers (posters). A list for downloading them one by one will be added here soon.

LNCS Frontmatter

In these files, the page numbers are final. For citation, please look up your name in the author index (there, II-123 means Part 2, page 123), and then locate your paper in the table of contents (there, Part 1 goes first, and then Part 2). The ISBN and other publication details are on the title page (the last page of the file) and on the cover image below.

In the draft preface, you can find the acceptance rate and other info about the conference.

Part 1: LNCS 8403

Cover image
Title page
Preface (draft!)
Contents
Author index
On Springer's site

Part 2: LNCS 8404

Cover image
Title page
Preface (draft!): same as for Part 1
Contents: same as for Part 1, but in the ToC Part II goes before Part I
Author index: same as for Part 1
On Springer's site

LNCS Part 1: 8403

Lexical Resources

Using Word Association Norms to Measure Corpus Representativeness Reinhard Rapp 1
Optimality Theory as a Framework for Lexical Acquisition Thierry Poibeau 14
Verb clustering for Brazilian Portuguese Carolina Scarton, Lin Sun, Karin Kipper-Schuler, Magali Sanches Duran, Martha Palmer and Anna Korhonen 25
Spreading Relation Annotations in a Lexical Semantic Network Applied to Radiology Lionel Ramadier, Manel Zarrouk, Mathieu Lafourcade and Antoine Micheau 41
Issues in Encoding the Writing of Nepal’s Languages Pat Hall, Bal Krishna Bal, Sagun Dhakwa and Bhim Narayan Regmi 54
Compound Terms and their Multi-Word Variants: Case of German and Russian Languages Elizaveta Clouet and Beatrice Daille 70
A Fully Automated Approach for Arabic Slang Lexicon Extraction from Microblogs Hady Elsahar and Samhaa El-Beltagy 81
Simple TF·IDF is not the Best you can get for Regionalism Classification Hiram Calvo 94
Improved Text Extraction from PDF Documents for Large-Scale Natural Language Processing Jörg Tiedemann 104

Document Representation

Sentic Parser: A Dependency Relation Based Concept Parser for Concept Level Text Analysis Soujanya Poria and Alexander Gelbukh 116
Obtaining Better Word Representations via Language Transfer Changliang Li, Bo Xu, Gaowei Wu, Xiuying Wang, Wendong Ge and Yan Li 131
Exploring Applications of Representation Learning in Nepali Anjan Nepal and Alexander Yates 141
Topic Models Incorporating Statistical Word Senses Guoyu Tang, Yunqing Xia, Jun Sun, Min Zhang and Thomas Fang Zheng 153
How Preprocessing Affects Unsupervised Keyphrase Extraction Rui Wang, Wei Liu and Chris McDonald 165

Morphology, POS-tagging, and Named Entity Recognition

Methods and Algorithms for Unsupervised Learning of Morphology Suresh Manandhar and Burcu Can 179
Morphological Analysis of the Bishnupriya Manipuri Language using Finite State Transducers Nayan Jyoti Kalita, Navanath Saharia and Smriti Kumar Sinha 208
A hybrid approach to the development of part-of-speech tagger for Kafi-noonoo text Zelalem Mekuria and Yaregal Assabie 216
Modified Differential Evolution for Biochemical Name Recognizer Utpal Sikdar, Asif Ekbal and Sriparna Saha 227

Syntax and Parsing

Extended CFG formalism for grammar checker and parser development Daiga Deksne, Raivis Skadiņš and Inguna Skadiņa 239
Dealing with Function Words in Unsupervised Dependency Parsing David Mareček and Zdeněk Žabokrtský 252
When rules meet bigrams Eric Wehrli and Luka Nerima 264
Methodology for Connecting Nouns to their Modifying Adjectives Nir Ofek, Lior Rokach and Prasenjit Mitra 274
Constituency Parsing of Complex Noun Sequences in Hindi Arpita Batra, Soma Paul and Amba Kulkarni 288
Amharic Sentence Parsing Using Base Phrase Chunking Abeba Ibrahim and Yaregal Assabie 300

Anaphora Resolution

A Machine Learning Approach to Pronominal Anaphora Resolution in Dialogue based Intelligent Tutoring Systems Nobal B. Niraula and Vasile Rus 310
A Maximum Entropy based Honorificity Identification for Bengali Pronominal Anaphora Resolution Apurbalal Senapati and Utpal Garain 322

Recognizing Textual Entailment

Statistical Relational Learning to Recognise Textual Entailment Miguel Rios and Lucia Specia 333
Annotation Game for Textual Entailment Evaluation Zuzana Neverilova 343

Semantics and Discourse

Axiomatizing Complex Concepts from Fundamentals Jerry Hobbs and Andrew Gordon 355
A Semantics-Oriented Grammar for Chinese Treebanking Meishan Zhang, Yue Zhang, Wanxiang Che and Ting Liu 368
Unsupervised Interpretation of Eventive Propositions Anselmo Peñas, Bernardo Cabaleiro and Mirella Lapata 381
Sense-Specific Implicative Commitments Gerard de Melo and Valeria de Paiva 393
A Tiered Approach to the Recognition of Metaphor David Bracewell, Marc Tomlinson, Michael Mohler and Bryan Rink 405
Knowledge discovery with CRF-based clustering of named entities without a priori classes Vincent Claveau and Abir Ncibi 417
Semi-supervised SRL system with Bayesian inference Alejandra Lorenzo and Christophe Cerisara 433
A Sentence Similarity Method based on Chunking and Information Content Dan Ştefănescu, Rajendra Banjade and Vasile Rus 446
An Investigation on the Influence of Genres and Textual Organizations on the Use of Discourse Relations Felix-Herve Bachand, Elnaz Davoodi and Leila Kosseim 458
Discourse Tagging for Indian Languages Sobha Lalitha Devi, Lakshmi S and Sindhuja Gopalan 470

Natural Language Generation

Classification-based Referring Expression Generation Thiago Ferreira and Ivandre Paraboni 482
Generating Relational Descriptions involving Mutual Disambiguation Caio Teixeira, Ivandré Paraboni, Adriano Silva and Alan Yamasaki 494
Bayesian Inverse Reinforcement Learning for Modeling Conversational Agents in a Virtual Environment Lina Rojas and Christophe Cerisara 505
Learning to summarize time series data Pranay Kumar Venkata Sowdaboina, Sutanu Chakraborti and Sripada Somayajulu G 517


LNCS Part 2: 8404

Sentiment Analysis and Emotion Recognition

Sentence-Level Sentiment Analysis in the Presence of Modalities Yang Liu, Xiaohui Yu, Bing Liu and Zhongshuai Chen 1
Word-Level Emotion Recognition using High-Level Features Johanna Moore, Leimin Tian and Catherine Lai 17
Constructing Context-aware Sentiment Lexicons with an Asynchronous Game with a Purpose Marina Boia, Claudiu Cristian Musat and Boi Faltings 33
Acknowledging Discourse Function for Sentiment Analysis Phillip Smith and Mark Lee 46
A Method of Polarity Computation of Chinese Sentiment Words Based on Gaussian Distribution Ruijing Li, Shumin Shi, Heyan Huang, Chao Su and Tianhang Wang 54
A Sentence Vector based Over-sampling Method for Imbalanced Emotion Classification Tao Chen, Ruifeng Xu, Qin Lu, Bin Liu, Jun Xu and Lin Yao 64
News Reader’s Emotion Prediction Using Concept and Concept Sequence Features in Headline Ruifeng Xu, Jun Xu, Bin Liu, Lin Yao and Qin Lu 76
Emotions target in health forums Sandra Bringay, Eric Kergosien, Pierre Pompidor and Pascal Poncelet 88
Investigating the Role of Emotion-based Features in Author Gender Classification of Text Calkin Suero Montero, Tuomo Kakkonen and Myriam Munezero 101

Opinion Mining and Social Networks

A Review Corpus for Argumentation Analysis Henning Wachsmuth, Martin Trenkmann, Benno Stein, Gregor Engels and Tsvetomira Palakarska 118
Looking for Opinion in Land-use Planning Corpora Eric Kergosien, Cédric Lopez, Mathieu Roche and Maguelonne Teisseire 130
Cross-lingual Product Recommendation Using Collaborative Filtering With Translation Pairs Kanako Komiya, Shohei Shibata and Yoshiyuki Kotani 143
Identifying a Demand towards a Company in Consumer-Generated Media Yuta Kikuchi, Hiroya Takamura, Manabu Okumura and Satoshi Nakazawa 155
Standardizing Tweets with Character-level Machine Translation Nikola Ljubešić, Tomaž Erjavec and Darja Fišer 166
#impressme: The Language of Motivation in User Generated Content Marc Tomlinson, Wayne Krug, David Hinote and David Bracewell 178
Mining the Personal Interests of Microbloggers via Exploiting Wikipedia Knowledge Miao Fan, Qiang Zhou and Thomas Fang Zheng 190
Website Community Mining from Query Logs with Two-phase Clustering Lidong Bing, Wai Lam, Shoaib Jameel and Chunliang Lu 203
Extracting Social Events based on Timeline and User Reliability Analysis on Twitter Bayar Tsolmon and Kyung-Soon Lee 215

Machine Translation and Multilingualism

Beam-Width Adaptation for Hierarchical Phrase-Based Translation Su Fei, Gang Chen and Xinyan Xiao 226
Training phrase-based SMT without explicit word alignment Cyrine Nasri, Kamel Smaili and Chiraz Latiri 235
Role of Paraphrases in PB-SMT Santanu Pal, Pintu Lohar and Sudip Kumar Naskar 245
Inferring Paraphrases for a Highly Inflected Language from a Monolingual Corpus Kfir Bar and Nachum Dershowitz 257
Improving Egyptian-to-English SMT by mapping Egyptian into MSA Nadir Durrani, Yaser Al-Onaizan and Abraham Ittycheriah 274
Bilingually Learning Word Senses for Translation Joao Casteleiro, Gabriel Pereira Lopes and Joaquim Silva 286
Iterative Bilingual Lexicon Extraction from Comparable Corpora with Topical and Contextual Knowledge Chenhui Chu, Toshiaki Nakazawa and Sadao Kurohashi 299
Improving Bilingual Lexicon Extraction from Comparable Corpora using Window-based and Syntax-based Models Amir Hazem and Emmanuel Morin 312
An IR-based strategy for supporting Chinese-Portuguese translation services in off-line mode Martha Ruiz Costa-Jussà, Rafael E. Banchs and Alexander Gelbukh 326
Cross Lingual Snippet Generation using Snippet Translation System Pintu Lohar, Pinaki Bhaskar, Santanu Pal and Sivaji Bandyopadhyay 333
A Novel Machine Translation Method for Learning Chinese as a Foreign Language Tiansi Dong and Armin B. Cremers 345

Information Retrieval

A New Relevance Feedback Algorithm Based on Vector Space Basis Change Rabeb Mbarek, Mohamed Tmar and Hawete Hattab 357
How Complementary Are Different Information Retrieval Techniques? - A Study in Biomedicine Domain Xiangdong An and Nick Cercone 369
Performance of Turkish Information Retrieval: Evaluating the Impact of Linguistic Parameters and Compound Nouns Hatem Haddad and Bechikh Ali Chedi 382

Text Classification and Clustering

How Document Properties Affect Document Relatedness Measures Jessica Perrie, Aminul Islam and Evangelos Milios 393
Multi-attribute classification of text documents as a tool for ranking and categorization of educational innovation projects Alexey An, Bakytkan Dauletbakov and Eugene Levner 405
Named Entities as new Features for Czech Document Classification Pavel Kral 418
A Knowledge-poor Approach to Turkish Text Categorization Savas Yildirim 430
Credible or Incredible? Dissecting Urban Legends Marco Guerini and Carlo Strapparava 443
Intelligent Clustering Scheme for Log Data Streams Basanta Joshi, Manoj Ghimire and Umanga Bista 456

Text Summarization

Graph Ranking on Maximal Frequent Sequences for Single Extractive Text Summarization Yulia Ledeneva, René Arnulfo García-Hernández and Alexander Gelbukh 468

Plagiarism Detection

A Graph Based Automatic Plagiarism Detection Technique to Handle The Artificial Word Reordering and Paraphrasing Niraj Kumar 483
Identification of Plagiarism using Syntactic and Semantic Filters Vijay Sundar Ram, Efstathios Stamatatos and Sobha Lalitha Devi 497

Style and Spelling Checking

Text Readability Classification of Bangla Texts Zahrul Islam, Md. Rashedur Rahman and Alexander Mehler 509
State-of-the-Art in Weighted Finite-State Spell-Checking Tommi Pirinen and Krister Lindén 521
Spelling Correction for Kazakh Aibek Makazhanov, Olzhas Makhambetov, Islam Sabyrgaliyev and Zhandos Yessenbayev 535

Speech Processing

A preliminary study on the VOT patterns of the Assamese language and its Nalbaria variety Sanghamitra Nath, Himangshu Sarma and Utpal Sharma 544


Evaluation of Sentence Compression Techniques Against Human Performance Prasad Perera and Leila Kosseim 555
Automatically Assessing Children's Written Skills Based on Age-supervised Datasets Nelly Moreno, Sergio Jimenez and Julia Baquero 566