Handling conjunctions in named entities

Robert Dale*, Paweł Mazur

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

2 Citations (Scopus)

Abstract

Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain problematic for existing text processing systems. One of these is the ambiguity of conjunctions in candidate named entity strings, an all-too-prevalent problem in corporate and legal documents. In this paper, we distinguish four uses of the conjunction in these strings, and explore the use of a supervised machine learning approach to conjunction disambiguation trained on a very limited set of 'name internal' features that avoids the need for expensive lexical or semantic resources. We achieve 84% correctly classified examples using k-fold evaluation on a data set of 600 instances. Further improvements are likely to require the use of wider domain knowledge and name external features.
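
As a rough illustration of what "name internal" features and k-fold evaluation could look like in practice, the following is a minimal sketch and not the authors' implementation. The feature names, the `COMPANY_SUFFIXES` list, the example string, and the choice of 10 folds are all assumptions made for illustration; the abstract does not specify the actual feature set, the four conjunction categories, or the value of k.

```python
# Hypothetical suffix list for illustration; not taken from the paper.
COMPANY_SUFFIXES = {"Ltd", "Inc", "Pty", "Limited", "Corporation", "Corp"}
CONJUNCTIONS = {"and", "&"}


def name_internal_features(candidate: str) -> dict:
    """Compute features from the candidate string alone (no surrounding text).

    Assumes the candidate contains at least one conjunction token;
    raises ValueError otherwise.
    """
    tokens = candidate.split()
    conj_positions = [i for i, t in enumerate(tokens) if t.lower() in CONJUNCTIONS]
    if not conj_positions:
        raise ValueError("candidate contains no conjunction")
    conj_index = conj_positions[0]  # only the first conjunction is considered
    left, right = tokens[:conj_index], tokens[conj_index + 1:]
    return {
        "conj_form": tokens[conj_index],
        "left_len": len(left),
        "right_len": len(right),
        "left_has_suffix": any(t.strip(".,") in COMPANY_SUFFIXES for t in left),
        "right_has_suffix": any(t.strip(".,") in COMPANY_SUFFIXES for t in right),
        "all_capitalised": all(t[0].isupper() for t in left + right),
    }


def k_fold_indices(n: int, k: int):
    """Yield (train, test) index lists for k contiguous folds over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Illustrative candidate string (not drawn from the paper's data set).
features = name_internal_features("Australia and New Zealand Banking Group Limited")

# 600 instances as in the abstract; k = 10 is an assumed value.
folds = list(k_fold_indices(600, 10))
```

Each feature dictionary would then be passed, together with a hand-assigned conjunction label, to whichever supervised classifier is chosen; the sketch deliberately stops short of naming one, since the abstract does not.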

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 8th International Conference, CICLing 2007, Proceedings
Editors: Alexander Gelbukh
Place of Publication: Berlin ; London
Publisher: Springer, Springer Nature
Pages: 131-142
Number of pages: 12
Volume: 4394 LNCS
ISBN (Print): 354070938X, 9783540709381
Publication status: Published - 2007
Event: 8th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2007 - Mexico City, Mexico
Duration: 18 Feb 2007 to 24 Feb 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4394 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2007
Country: Mexico
City: Mexico City
Period: 18/02/07 to 24/02/07
