CICLing 2014 Proceedings

CONFIDENTIAL MATERIAL. PLEASE DO NOT DISTRIBUTE!

Click on the number on the right-hand side for the full text. Notes: These numbers roughly correspond to the page numbers in the book, but not exactly. For citation, please see the final ToC. The text of the papers corresponds to the final book, but formatting and layout, including the number of pages, may differ.

Download all LNCS papers in one file. Subfolders correspond to volumes 1 and 2. Again, the numbers only roughly correspond to the page numbers.

NEW: Download all non-LNCS papers (posters). A list for downloading them one by one will be added here soon.

LNCS Frontmatter

In these files, the page numbers are final. For citation, look up your name in the author index (there, II-123 means Part 2, page 123), then locate your paper in the table of contents (there, Part 1 comes first, followed by Part 2). The ISBN and other publication details are on the title page (the last page of the file) and on the cover image below.

In the draft preface, you can find the acceptance rate and other info about the conference.

Part 1: LNCS 8403		Part 2: LNCS 8404
Cover image		Cover image
Title page		Title page
Preface (draft!)		Same as for Part 1
Contents		Same as for Part 1, but in the ToC Part II goes before Part I
Author index		Same as for Part 1
On Springer's site		On Springer's site

LNCS Part 1: 8403

Lexical Resources
Using Word Association Norms to Measure Corpus Representativeness	Reinhard Rapp	1
Optimality Theory as a Framework for Lexical Acquisition	Thierry Poibeau	14
Verb clustering for Brazilian Portuguese	Carolina Scarton, Lin Sun, Karin Kipper-Schuler, Magali Sanches Duran, Martha Palmer and Anna Korhonen	25
Spreading Relation Annotations in a Lexical Semantic Network Applied to Radiology	Lionel Ramadier, Manel Zarrouk, Mathieu Lafourcade and Antoine Micheau	41
Issues in Encoding the Writing of Nepal’s Languages	Pat Hall, Bal Krishna Bal, Sagun Dhakwa and Bhim Narayan Regmi	54
Compound Terms and their Multi-Word Variants: Case of German and Russian Languages	Elizaveta Clouet and Beatrice Daille	70
A Fully Automated Approach for Arabic Slang Lexicon Extraction from Microblogs	Hady Elsahar and Samhaa El-Beltagy	81
Simple TF·IDF is not the Best you can get for Regionalism Classification	Hiram Calvo	94
Improved Text Extraction from PDF Documents for Large-Scale Natural Language Processing	Jörg Tiedemann	104
Document Representation
Sentic Parser: A Dependency Relation Based Concept Parser for Concept Level Text Analysis	Soujanya Poria and Alexander Gelbukh	116
Obtaining Better Word Representations via Language Transfer	Changliang Li, Bo Xu, Gaowei Wu, Xiuying Wang, Wendong Ge and Yan Li	131
Exploring Applications of Representation Learning in Nepali	Anjan Nepal and Alexander Yates	141
Topic Models Incorporating Statistical Word Senses	Guoyu Tang, Yunqing Xia, Jun Sun, Min Zhang and Thomas Fang Zheng	153
How Preprocessing Affects Unsupervised Keyphrase Extraction	Rui Wang, Wei Liu and Chris McDonald	165
Morphology, POS-tagging, and Named Entity Recognition
Methods and Algorithms for Unsupervised Learning of Morphology	Suresh Manandhar and Burcu Can	179
Morphological Analysis of the Bishnupriya Manipuri Language using Finite State Transducers	Nayan Jyoti Kalita, Navanath Saharia and Smriti Kumar Sinha	208
A hybrid approach to the development of part-of-speech tagger for Kafi-noonoo text	Zelalem Mekuria and Yaregal Assabie	216
Modified Differential Evolution for Biochemical Name Recognizer	Utpal Sikdar, Asif Ekbal and Sriparna Saha	227
Syntax and Parsing
Extended CFG formalism for grammar checker and parser development	Daiga Deksne, Raivis Skadiņš and Inguna Skadiņa	239
Dealing with Function Words in Unsupervised Dependency Parsing	David Mareček and Zdeněk Žabokrtský	252
When rules meet bigrams	Eric Wehrli and Luka Nerima	264
Methodology for Connecting Nouns to their Modifying Adjectives	Nir Ofek, Lior Rokach and Prasenjit Mitra	274
Constituency Parsing of Complex Noun Sequences in Hindi	Arpita Batra, Soma Paul and Amba Kulkarni	288
Amharic Sentence Parsing Using Base Phrase Chunking	Abeba Ibrahim and Yaregal Assabie	300
Anaphora Resolution
A Machine Learning Approach to Pronominal Anaphora Resolution in Dialogue based Intelligent Tutoring Systems	Nobal B. Niraula and Vasile Rus	310
A Maximum Entropy based Honorificity Identification for Bengali Pronominal Anaphora Resolution	Apurbalal Senapati and Utpal Garain	322
Recognizing Textual Entailment
Statistical Relational Learning to Recognise Textual Entailment	Miguel Rios and Lucia Specia	333
Annotation Game for Textual Entailment Evaluation	Zuzana Neverilova	343
Semantics and Discourse
Axiomatizing Complex Concepts from Fundamentals	Jerry Hobbs and Andrew Gordon	355
A Semantics-Oriented Grammar for Chinese Treebanking	Meishan Zhang, Yue Zhang, Wanxiang Che and Ting Liu	368
Unsupervised Interpretation of Eventive Propositions	Anselmo Peñas, Bernardo Cabaleiro and Mirella Lapata	381
Sense-Specific Implicative Commitments	Gerard de Melo and Valeria de Paiva	393
A Tiered Approach to the Recognition of Metaphor	David Bracewell, Marc Tomlinson, Michael Mohler and Bryan Rink	405
Knowledge discovery with CRF-based clustering of named entities without a priori classes	Vincent Claveau and Abir Ncibi	417
Semi-supervised SRL system with Bayesian inference	Alejandra Lorenzo and Christophe Cerisara	433
A Sentence Similarity Method based on Chunking and Information Content	Dan Ştefănescu, Rajendra Banjade and Vasile Rus	446
An Investigation on the Influence of Genres and Textual Organizations on the Use of Discourse Relations	Felix-Herve Bachand, Elnaz Davoodi and Leila Kosseim	458
Discourse Tagging for Indian Languages	Sobha Lalitha Devi, Lakshmi S and Sindhuja Gopalan	470
Natural Language Generation
Classification-based Referring Expression Generation	Thiago Ferreira and Ivandre Paraboni	482
Generating Relational Descriptions involving Mutual Disambiguation	Caio Teixeira, Ivandré Paraboni, Adriano Silva and Alan Yamasaki	494
Bayesian Inverse Reinforcement Learning for Modeling Conversational Agents in a Virtual Environment	Lina Rojas and Christophe Cerisara	505
Learning to summarize time series data	Pranay Kumar Venkata Sowdaboina, Sutanu Chakraborti and Sripada Somayajulu G	517

LNCS Part 2: 8404

Sentiment Analysis and Emotion Recognition
Sentence-Level Sentiment Analysis in the Presence of Modalities	Yang Liu, Xiaohui Yu, Bing Liu and Zhongshuai Chen	1
Word-Level Emotion Recognition using High-Level Features	Johanna Moore, Leimin Tian and Catherine Lai	17
Constructing Context-aware Sentiment Lexicons with an Asynchronous Game with a Purpose	Marina Boia, Claudiu Cristian Musat and Boi Faltings	33
Acknowledging Discourse Function for Sentiment Analysis	Phillip Smith and Mark Lee	46
A Method of Polarity Computation of Chinese Sentiment Words Based on Gaussian Distribution	Ruijing Li, Shumin Shi, Heyan Huang, Chao Su and Tianhang Wang	54
A Sentence Vector based Over-sampling Method for Imbalanced Emotion Classification	Tao Chen, Ruifeng Xu, Qin Lu, Bin Liu, Jun Xu and Lin Yao	64
News Reader’s Emotion Prediction Using Concept and Concept Sequence Features in Headline	Ruifeng Xu, Jun Xu, Bin Liu, Lin Yao and Qin Lu	76
Emotions target in health forums	Sandra Bringay, Eric Kergosien, Pierre Pompidor and Pascal Poncelet	88
Investigating the Role of Emotion-based Features in Author Gender Classification of Text	Calkin Suero Montero, Tuomo Kakkonen and Myriam Munezero	101
Opinion Mining and Social Networks
A Review Corpus for Argumentation Analysis	Henning Wachsmuth, Martin Trenkmann, Benno Stein, Gregor Engels and Tsvetomira Palakarska	118
Looking for Opinion in Land-use Planning Corpora	Eric Kergosien, Cédric Lopez, Mathieu Roche and Maguelonne Teisseire	130
Cross-lingual Product Recommendation Using Collaborative Filtering With Translation Pairs	Kanako Komiya, Shohei Shibata and Yoshiyuki Kotani	143
Identifying a Demand towards a Company in Consumer-Generated Media	Yuta Kikuchi, Hiroya Takamura, Manabu Okumura and Satoshi Nakazawa	155
Standardizing Tweets with Character-level Machine Translation	Nikola Ljubešić, Tomaž Erjavec and Darja Fišer	166
#impressme: The Language of Motivation in User Generated Content	Marc Tomlinson, Wayne Krug, David Hinote and David Bracewell	178
Mining the Personal Interests of Microbloggers via Exploiting Wikipedia Knowledge	Miao Fan, Qiang Zhou and Thomas Fang Zheng	190
Website Community Mining from Query Logs with Two-phase Clustering	Lidong Bing, Wai Lam, Shoaib Jameel and Chunliang Lu	203
Extracting Social Events based on Timeline and User Reliability Analysis on Twitter	Bayar Tsolmon and Kyung-Soon Lee	215
Machine Translation and Multilingualism
Beam-Width Adaptation for Hierarchical Phrase-Based Translation	Su Fei, Gang Chen and Xinyan Xiao	226
Training phrase-based SMT without explicit word alignment	Cyrine Nasri, Kamel Smaili and Chiraz Latiri	235
Role of Paraphrases in PB-SMT	Santanu Pal, Pintu Lohar and Sudip Kumar Naskar	245
Inferring Paraphrases for a Highly Inflected Language from a Monolingual Corpus	Kfir Bar and Nachum Dershowitz	257
Improving Egyptian-to-English SMT by mapping Egyptian into MSA	Nadir Durrani, Yaser Al-Onaizan and Abraham Ittycheriah	274
Bilingually Learning Word Senses for Translation	Joao Casteleiro, Gabriel Pereira Lopes and Joaquim Silva	286
Iterative Bilingual Lexicon Extraction from Comparable Corpora with Topical and Contextual Knowledge	Chenhui Chu, Toshiaki Nakazawa and Sadao Kurohashi	299
Improving Bilingual Lexicon Extraction from Comparable Corpora using Window-based and Syntax-based Models	Amir Hazem and Emmanuel Morin	312
An IR-based strategy for supporting Chinese-Portuguese translation services in off-line mode	Martha Ruiz Costa-Jussà, Rafael E. Banchs and Alexander Gelbukh	326
Cross Lingual Snippet Generation using Snippet Translation System	Pintu Lohar, Pinaki Bhaskar, Santanu Pal and Sivaji Bandyopadhyay	333
A Novel Machine Translation Method for Learning Chinese as a Foreign Language	Tiansi Dong and Armin B. Cremers	345
Information Retrieval
A New Relevance Feedback Algorithm Based on Vector Space Basis Change	Rabeb Mbarek, Mohamed Tmar and Hawete Hattab	357
How Complementary Are Different Information Retrieval Techniques? - A Study in Biomedicine Domain	Xiangdong An and Nick Cercone	369
Performance of Turkish Information Retrieval: Evaluating the Impact of Linguistic Parameters and Compound Nouns	Hatem Haddad and Bechikh Ali Chedi	382
Text Classification and Clustering
How Document Properties Affect Document Relatedness Measures	Jessica Perrie, Aminul Islam and Evangelos Milios	393
Multi-attribute classification of text documents as a tool for ranking and categorization of educational innovation projects	Alexey An, Bakytkan Dauletbakov and Eugene Levner	405
Named Entities as new Features for Czech Document Classification	Pavel Kral	418
A Knowledge-poor Approach to Turkish Text Categorization	Savas Yildirim	430
Credible or Incredible? Dissecting Urban Legends	Marco Guerini and Carlo Strapparava	443
Intelligent Clustering Scheme for Log Data Streams	Basanta Joshi, Manoj Ghimire and Umanga Bista	456
Text Summarization
Graph Ranking on Maximal Frequent Sequences for Single Extractive Text Summarization	Yulia Ledeneva, René Arnulfo García-Hernández and Alexander Gelbukh	468
Plagiarism Detection
A Graph Based Automatic Plagiarism Detection Technique to Handle The Artificial Word Reordering and Paraphrasing	Niraj Kumar	483
Identification of Plagiarism using Syntactic and Semantic Filters	Vijay Sundar Ram, Efstathios Stamatatos and Sobha Lalitha Devi	497
Style and Spelling Checking
Text Readability Classification of Bangla Texts	Zahrul Islam, Md. Rashedur Rahman and Alexander Mehler	509
State-of-the-Art in Weighted Finite-State Spell-Checking	Tommi Pirinen and Krister Lindén	521
Spelling Correction for Kazakh	Aibek Makazhanov, Olzhas Makhambetov, Islam Sabyrgaliyev and Zhandos Yessenbayev	535
Speech Processing
A preliminary study on the VOT patterns of the Assamese language and its Nalbaria variety	Sanghamitra Nath, Himangshu Sarma and Utpal Sharma	544
Applications
Evaluation of Sentence Compression Techniques Against Human Performance	Prasad Perera and Leila Kosseim	555
Automatically Assessing Children Written Skills Based on Age-supervised Datasets	Nelly Moreno, Sergio Jimenez and Julia Baquero	566
