Lexicalized stochastic modeling of constraint-based grammars using log-linear measures and EM training

Stefan Riezler, Detlef Prescher, Jonas Kuhn, Mark Johnson

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution

Abstract

We present a new approach to stochastic modeling of constraint-based grammars that is based on log-linear models and uses EM for estimation from unannotated data. The techniques are applied to an LFG grammar for German. Evaluation on an exact match task yields 86% precision for an ambiguity rate of 5.4, and 90% precision on a subcat frame match for an ambiguity rate of 25. Experimental comparison to training from a parsebank shows a 10% gain from EM training. Also, a new class-based grammar lexicalization is presented, showing a 10% gain over unlexicalized models.
Original language: English
Title of host publication: Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL'00), Hong Kong
Place of Publication: Stroudsburg, PA
Publisher: Association for Computational Linguistics (ACL)
Pages: 480-487
Number of pages: 8
Publication status: Published - 2000
Externally published: Yes
Event: Annual Meeting of the Association for Computational Linguistics (38th : 2000) - Hong Kong
Duration: 1 Oct 2000 to 8 Oct 2000

Conference

Conference: Annual Meeting of the Association for Computational Linguistics (38th : 2000)
City: Hong Kong
Period: 1/10/00 to 8/10/00

Bibliographical note

Copyright the Publisher 2000. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
