A. Gelbukh (Ed.) Computational Linguistics and Intelligent Text Processing (CICLing-2004). Lecture Notes in Computer Science, Vol. 2945, Springer-Verlag, 2004.

Contents

The page numbers are tentative and may differ slightly from those in the printed book.

Only the abstract of each paper is available online. The full paper can be purchased from Springer-Verlag.

 

v

Preface (Committees, Additional Reviewers, Organization and Contact)

 

Computational Linguistics

 

Computational Linguistics Formalisms

1

Towards an LFG Syntax-Semantics Interface for Frame Semantics Annotation

 

Anette Frank, Katrin Erk

13

Projections from Morphology to Syntax in the Korean Resource Grammar: Implementing Typed Feature Structures

 

Jong-Bok Kim, Jaehyung Yang

25

A Systemic-Functional Approach to Japanese Text Understanding

 

Noriko Ito, Toru Sugimoto, Michio Sugeno

37

Building and Using a Russian Resource Grammar in GF

 

Janna Khegai, Aarne Ranta

41

An Application of a Semantic Framework for the Analysis of Chinese Sentences

 

Li Tang, Donghong Ji, Yu Nie, Lingpeng Yang

 

Semantics and Dialogue

45

A Modal Logic Framework for Human-Computer Spoken Interaction

 

Luis Villaseñor-Pineda, Manuel Montes-y-Gómez, Jean Caelen

54

Agents Interpreting Imperative Sentences

 

Miguel Pérez-Ramírez, Chris Fox

66

Intention Retrieval with a Socially-supported Belief System

 

Naoko Matsumoto, Akifumi Tokosumi

70

Extracting Domain Knowledge for Dialogue Model Adaptation

 

Kuei-Kuang Lin, Hsin-Hsi Chen

 

Syntax and Parsing

79

A Probabilistic Chart Parser implemented with an Evolutionary Algorithm

 

Lourdes Araujo

91

Probabilistic Shift-Reduce Parsing Model Using Rich Contextual Information

 

Yong-Jae Kwak, So-Young Park, Joon-Ho Lim, Hae-Chang Rim

95

Evaluation of Feature Combination for Effective Structural Disambiguation

 

So-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim

99

Parsing incomplete sentences revisited

 

Manuel Vilares, Victor M. Darriba, Jesús Vilares

109

Unlexicalized Dependency Parser for Variable Word Order Languages based on Local Contextual Pattern

 

Hoojung Chung, Hae-Chang Rim

121

A cascaded syntactic analyser for Basque

 

Itziar Aduriz, Maxux J. Aranzabe, Jose Mari Arriola, Arantza Díaz de Ilarraza, Koldo Gojenola, Maite Oronoz, Larraitz Uria

 

Lexical Analysis

132

An Analysis of Sentence Boundary Detection Systems for English and Portuguese Documents

 

Carlos N. Silla Jr., Celso A. A. Kaestner

139

Towards Language-Independent Sentence Boundary Detection

 

Do-Gil Lee, Hae-Chang Rim

143

Korean Compound Noun Decomposition Using Syllabic Information Only

 

Seong-Bae Park, Jeong-Ho Chang, Byoung-Tak Zhang

 

Named Entity Recognition

155

Learning Named Entity Classifiers using Support Vector Machines

 

Thamar Solorio, Aurelio López López

165

An Internet-based method for Verification of Extracted Proper Names

 

Angelo Dalli

169

Boundary Correction of Protein Names Adapting Heuristic Rules

 

Tomohiro Mitsumori, Sevrani Fation, Masaki Murata, Kouichi Doi, Hirohumi Doi

 

Word Sense Disambiguation

173

Word Sense Disambiguation Based on Weight Distribution Model with Multiword Expression

 

Hee-Cheol Seo, Young-Sook Hwang, Hae-Chang Rim

185

Combining EWN and sense-untagged corpus for WSD

 

Iulia Nica, Mª. Antònia Martí, Andrés Montoyo, Sonia Vázquez

198

Feature Selection for Chinese Character Sense Discrimination

 

Zheng-Yu Niu, Dong-Hong Ji

206

The Role of Temporal Expressions in Word Sense Disambiguation

 

Sonia Vázquez, Estela Saquete, Andrés Montoyo, Patricio Martínez‑Barco, Rafael Muñoz

 

Anaphora Resolution

210

An Empirical Study on Pronoun Resolution in Chinese

 

Wang Houfeng, Mei Zheng

 

Lexicon and Corpus

214

Language-independent Methods for Compiling Monolingual Lexical Data

 

Christian Biemann, Stefan Bordag, Gerhard Heyer, Uwe Quasthoff, Christian Wolff

226

Getting One’s First Million… Collocations

 

Igor A. Bolshakov

240

Automatic Syntactic Analysis for Detection of Word Combinations

 

Alexander Gelbukh, Grigori Sidorov, Sang-Yong Han, Erika Hernández-Rubio

245

A Small System Storing Spanish Collocations

 

Igor A. Bolshakov, Sabino Miranda-Jiménez

250

A Semi-Automatic Tree Annotating Workbench for Building a Korean Treebank

 

Joon-Ho Lim, So-Young Park, Yong-Jae Kwak, Hae-Chang Rim

254

Extracting Semantic Categories of Nouns for Syntactic Disambiguation from Human‑Oriented Explanatory Dictionaries

 

Hiram Calvo, Alexander Gelbukh

258

Hierarchies Measuring Qualitative Variables

 

Serguei Levachkine, Adolfo Guzmán-Arenas

 

Bilingual Resources

 

 Invited talk:

271

Substring Alignment using Suffix Trees

 

Martin Kay

 

 Invited talk:

280

Exploiting Hidden Meanings: Using Bilingual Text for Monolingual Annotation

 

Philip Resnik

297

Acquisition of Word Translations Using Local Focus-based Learning in Ainu-Japanese Parallel Corpora

 

Hiroshi Echizen-ya, Kenji Araki, Yoshio Momouchi, Koji Tochinai

302

Sentence Alignment for Spanish-Basque Bitexts: Word Correspondences vs. Markup similarity

 

Arantza Casillas, Idoia Fernández, Raquel Martínez

306

Two-level Alignment by Words and Phrases Based on Syntactic Information

 

Seonho Kim, Juntae Yoon, Dong-Yul Ra

 

Machine Translation

318

Exploiting a Mono-Bilingual Dictionary for English-Korean Translation Selection and Sense Disambiguation

 

Hyun Ah Lee, Juntae Yoon, Gil Chang Kim

330

Source Language Effect on Translating Korean Honorifics

 

Kyonghee Paik, Kiyonori Ohtake, Francis Bond, Kazuhide Yamamoto

334

An Algorithm for Determining DingYu Structural Particle using Grammar Knowledge and Statistical Information

 

Fuji Ren

 

Natural Language Generation

346

Generating natural word orders in a semi-free word order language: Treebank-based linearization preferences for German

 

Gerard Kempen, Karin Harbusch

351

Guideline for developing a software life cycle process in natural language generation projects

 

Mª del Socorro Bernardos

 

Human-Computer Interaction Applications

355

A plug and play spoken dialogue interface for smart environments

 

Germán Montoro, Xavier Alamán, Pablo A. Haya

366

Evaluation of Japanese Dialogue Processing Method Based on Similarity Measure Using tf · AoI

 

Yasutomo Kimura, Kenji Araki, Koji Tochinai

378

Towards Programming in Everyday Language: A Case for Email Management

 

Toru Sugimoto, Noriko Ito, Shino Iwashita, Michio Sugeno

 

Speech Recognition and Synthesis

 

 Invited talk:

390

Specifying Affect and Emotion for Expressive Speech Synthesis

 

Nick Campbell

402

Overcoming the Sparseness Problem of Spoken Language Corpora using Other Large corpora of distinct characteristics

 

Sehyeong Cho, SangHun Kim, Jun Park, YoungJik Lee

406

A Syllabification Algorithm for Spanish

 

Heriberto Cuayáhuitl

410

Experiments on the Construction of a Phonetically Balanced Corpus from the Web

 

Luis Villaseñor-Pineda, Manuel Montes-y-Gómez, Dominique Vaufreydaz, Jean-François Serignat

 

Intelligent Text Processing

 

Indexing

414

Head/Modifier Frames for Information Retrieval

 

Cornelis H.A. Koster

427

Performance Analysis of Semantic Indexing in Text Retrieval

 

Bo-Yeong Kang, Hae-Jung Kim, Sang-Jo Lee

431

A Model for Extracting Keywords of Document Using Term Frequency and Distribution

 

Jae-Woo Lee, Doo-Kwon Baik

435

A Combining Approach to Automatic Keyphrases Indexing for Chinese News Documents

 

Wang Houfeng, Li Sujian, Yu Shiwen, Kang Byeong Kwu

 

Information Retrieval

 

 Invited talk:

439

Challenges in the Interaction of Information Retrieval and Natural Language Processing

 

Ricardo Baeza-Yates

451

The Challenge of Creative Information Retrieval

 

Tony Veale

463

Using T-Ret System to Improve Incident Report Retrieval

 

Joe Carthy, David C. Wilson, Ruichao Wang, John Dunnion, Anne Drummond

 

Question Answering and Sentence Retrieval

467

Spanish Question Answering Evaluation

 

Anselmo Peñas, Felisa Verdejo, Jesús Herrera

479

Comparative Analysis of Term Distributions in a Sentence and in a Document for Sentence Retrieval

 

Kyoung-Soo Han, Hae-Chang Rim

 

Browsing

483

Contextual Exploration of Text Collections

 

Manuel Montes-y-Gómez, Manuel Pérez-Coutiño, Luis Villaseñor-Pineda, Aurelio López-López

492

Automatic Classification and Skimming of Articles in a News Video Using Korean Closed-caption

 

Jung-Won Cho, Seung-Do Jeong, Byung-Uk Choi

 

Filtering

496

A Framework for Evaluation of Information Filtering Techniques in an Adaptive Recommender System

 

John O’Donovan, John Dunnion

500

Lexical Chains versus Keywords for Topic Tracking

 

Joe Carthy

504

Filtering Very Similar Text Documents: A Case Study

 

Jiří Hroza, Jan Žižka, Aleš Bourek

 

Information Extraction

515

Using Information Extraction to Build a Directory of Conference Announcements

 

Karl-Michael Schneider

527

Unsupervised Event Extraction from Biomedical Text based on Event and Pattern Information

 

Hong-woo Chun, Young-sook Hwang, Hae-Chang Rim

531

Thai Syllable-Based Information Extraction Using Hidden Markov Models

 

Lalita Narupiyakul, Calvin Thomas, Nick Cercone, Booncharoen Sirinaovakul

541

The impact of enriched linguistic annotation on the performance of extracting relation triples

 

Sanghee Kim, Paul Lewis, Kirk Martinez

 

Text Categorization

553

An kNN Model-based Approach and its Application in Text Categorization

 

Gongde Guo, Hui Wang, David Bell, Yaxin Bi, Kieran Greer

565

Automatic Learning Features Using Bootstrapping for Text Categorization

 

Chen Wenliang, Zhu Jingbo, Wu Honglin, Yao Tianshun

575

Recomputation of Class Relevance Scores for Improving Text Classification

 

Sang-Bum Kim, Hae-Chang Rim

579

Raising High-Degree Overlapped Character Bigrams into Trigrams for Dimensionality Reduction in Chinese Text Categorization

 

Dejun Xue, Maosong Sun

591

Information Retrieval and Text Categorization with Semantic Indexing

 

Paolo Rosso, Antonio Molina, Ferran Pla, Daniel Jiménez, Vicent Vidal

 

Document Clustering

596

Sampling and Feature Selection in a Genetic Algorithm for Document Clustering

 

Arantza Casillas, Mayte T. González de Lena, Raquel Martínez

608

A New Efficient Clustering Algorithm for Organizing Dynamic Data Collection

 

Kwangcheol Shin, Sangyong Han

612

Domain-informed Topic Detection

 

Cormac Flynn, John Dunnion

 

Summarization

622

Assessing the Impact of Lexical Chain Scoring Methods and Sentence Extraction Schemes on Summarization

 

William Doran, Nicola Stokes, Joe Carthy, John Dunnion

631

A Term Weighting Method based on Lexical Chain for Automatic Summarization

 

Young-In Song, Kyoung-Soo Han, Hae-Chang Rim

 

Language Identification

635

Centroid-Based Language Identification Using Letter Feature Set

 

Hidayet Takcı, İbrahim Soğukpınar

645

Author Index