Edge-based best-first chart parsing

Eugene Charniak, Sharon Goldwater, Mark Johnson

Research output: Chapter in Book/Report/Conference proceedingConference proceeding contribution

5 Downloads (Pure)

Abstract

Best-first probabilistic chart parsing attempts to parse efficiently by working on edges that are judged "best" by some probabilistic figure of merit (FOM). Recent work has used probabilistic context-free grammars (PCFGs) to assign probabilities to constituents, and to use these probabilities as the starting point for the FOM. This paper extends this approach to using a probabilistic FOM to judge edges (incomplete constituents), thereby giving a much finer grained control over parsing effort. We show how this can be accomplished in a particularly simple way using the common idea of binarizing the PCFG. The results obtained are about a factor of twenty improvement over the best prior results - that is, our parser achieves equivalent results using one twentieth the number of edges. Furthermore we show that this improvement is obtained with parsing precision and recall levels superior to those achieved by exhaustive parsing.
Original languageEnglish
Title of host publicationProceedings of the Sixth Workshop on Very Large Corpora
EditorsEugene Charniak
Place of PublicationSan Francisco
PublisherMorgan Kaufmann
Pages127-133
Number of pages7
Publication statusPublished - 1998
Externally publishedYes
EventWorkshop on Very Large Corpora (6th : 1998) - Université de Montréal, Montreal, Canada
Duration: 15 Aug 199816 Aug 1998

Conference

ConferenceWorkshop on Very Large Corpora (6th : 1998)
Abbreviated titleWVLC 6
CountryCanada
CityMontreal
Period15/08/9816/08/98

Bibliographical note

Copyright the Publisher 1998. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Fingerprint Dive into the research topics of 'Edge-based best-first chart parsing'. Together they form a unique fingerprint.

Cite this