Abstract
We present a PCFG parsing algorithm that uses a multilevel coarse-to-fine (mlctf) scheme to improve the efficiency of search for the best parse. Our approach requires the user to specify a sequence of nested partitions or equivalence classes of the PCFG nonterminals. We define a sequence of PCFGs corresponding to each partition, where the nonterminals of each PCFG are clusters of nonterminals of the original source PCFG. We use the results of parsing at a coarser level (i.e., grammar defined in terms of a coarser partition) to prune the next finer level. We present experiments showing that with our algorithm the work load (as measured by the total number of constituents processed) is decreased by a factor of ten with no decrease in parsing accuracy compared to standard CKY parsing with the original PCFG. We suggest that the search space over mlctf algorithms is almost totally unexplored so that future work should be able to improve significantly on these results.
Original language | English |
---|---|
Title of host publication | HLT-NAACL 2006 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings of the Main Conference |
Place of Publication | New York |
Publisher | ACM |
Pages | 168-175 |
Number of pages | 8 |
Publication status | Published - 2006 |
Externally published | Yes |
Event | 2006 Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, HLT-NAACL 2006 - New York, NY, United States Duration: 4 Jun 2006 → 9 Jun 2006 |
Other
Other | 2006 Human Language Technology Conference - North American Chapter of the Association for Computational Linguistics Annual Meeting, HLT-NAACL 2006 |
---|---|
Country/Territory | United States |
City | New York, NY |
Period | 4/06/06 → 9/06/06 |