Abstract
Many NLP tasks interact with syntax. The presence of a named entity span, for example, is often a clear indicator of a noun phrase in the parse tree, while a span in the syntax can help indicate the lack of a named entity in the spans that cross it. For these types of problems, joint inference offers a better solution than a pipelined approach, and yet large joint models are rarely pursued. In this paper we argue that this is due in part to the absence of a general framework for joint inference that can efficiently represent syntactic structure. We propose an alternative and novel method in which constituency parse constraints are imposed on the model via combinatorial factors in a Markov random field, guaranteeing that a variable configuration forms a valid tree. We apply this approach to jointly predicting parse and named entity structure, for which we introduce a zero-order semi-CRF named entity recognizer that also relies on a combinatorial factor. At the junction between these two models, soft constraints coordinate between syntactic constituents and named entity spans, providing an additional layer of flexibility in how these models interact. With this architecture we achieve the best reported results on both CRF-based parsing and named entity recognition on sections of the OntoNotes corpus, and outperform state-of-the-art parsers on an NP-identification task, while remaining asymptotically faster than traditional grammar-based parsers.
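
The hard constraint mentioned in the abstract, that a configuration of span variables must form a valid tree, can be illustrated with a small check. This is only a sketch under stated assumptions, not the authors' implementation: it assumes binary bracketing over half-open token spans, and the names `crosses` and `is_valid_bracketing` are hypothetical. A combinatorial factor in the model would assign zero probability to any configuration that fails such a test; the paper's actual factor and inference procedure are not reproduced here.

```python
from itertools import combinations


def crosses(a, b):
    """Return True if half-open spans a=(i, j) and b=(k, l) partially overlap."""
    (i, j), (k, l) = a, b
    return i < k < j < l or k < i < l < j


def is_valid_bracketing(spans, n):
    """Check whether the 'on' span variables form a binary tree over n tokens.

    spans: iterable of (start, end) half-open spans, each covering >= 2 tokens.
    Assumption (not from the source): single-token spans are implicit leaves.
    """
    spans = set(spans)
    if n < 2:
        return not spans
    # Every span must lie inside the sentence and cover at least two tokens.
    if any(not (0 <= i and j <= n and j - i >= 2) for i, j in spans):
        return False
    # The root span covering the whole sentence must be switched on.
    if (0, n) not in spans:
        return False
    # No two constituents may cross each other.
    if any(crosses(a, b) for a, b in combinations(spans, 2)):
        return False
    # A non-crossing set of multi-token spans is a full binary
    # bracketing exactly when it contains n - 1 spans.
    return len(spans) == n - 1
```

The same `crosses` test reflects the interaction described in the abstract: a candidate named entity span that crosses an active constituent is one that a soft constraint could penalize.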
Original language | English |
---|---|
Title of host publication | Proceedings of COLING 2012 |
Subtitle of host publication | technical papers |
Editors | Martin Kay, Christian Boitet |
Place of Publication | Mumbai, India |
Publisher | The COLING 2012 Organizing Committee |
Pages | 1995-2010 |
Number of pages | 16 |
Publication status | Published - 2012 |
Event | International Conference on Computational Linguistics (24th : 2012) - Mumbai, India; Duration: 8 Dec 2012 → 15 Dec 2012 |
Conference
Conference | International Conference on Computational Linguistics (24th : 2012) |
---|---|
City | Mumbai, India |
Period | 8 Dec 2012 → 15 Dec 2012 |