The effect of non-tightness on Bayesian estimation of PCFGs

Shay B. Cohen, Mark Johnson

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding contribution; Research; peer-review

Abstract

Probabilistic context-free grammars have the unusual property of not always defining tight distributions (i.e., the sum of the "probabilities" of the trees the grammar generates can be less than one). This paper reviews how this non-tightness can arise and discusses its impact on Bayesian estimation of PCFGs. We begin by presenting the notion of "almost everywhere tight grammars" and show that linear CFGs follow it. We then propose three different ways of reinterpreting non-tight PCFGs to make them tight, show that the Bayesian estimators in Johnson et al. (2007) are correct under one of them, and provide MCMC samplers for the other two. We conclude with a discussion of the impact of tightness empirically.
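To make the notion of non-tightness concrete, here is a minimal sketch using a standard toy grammar that is an illustrative assumption, not something taken from this record: S -> S S with probability p and S -> a with probability 1 - p. The total probability q of all finite trees is the least solution of q = p q^2 + (1 - p), which works out to min(1, (1 - p)/p), so the grammar is tight only when p <= 1/2. The function name and iteration count below are hypothetical choices for illustration.

```python
def termination_probability(p, iterations=10_000):
    """Total probability mass of finite trees for the toy grammar
    S -> S S (prob p), S -> a (prob 1 - p).

    Iterating q <- p*q**2 + (1 - p) from q = 0 gives, after n steps,
    the probability of all trees of height at most n; the limit is the
    least fixed point of the equation.
    """
    q = 0.0
    for _ in range(iterations):
        q = p * q * q + (1.0 - p)
    return q


if __name__ == "__main__":
    for p in (0.3, 0.7):
        closed_form = min(1.0, (1.0 - p) / p)
        print(f"p={p}: iterated={termination_probability(p):.6f}, "
              f"closed form={closed_form:.6f}")
```

With p = 0.7 the mass on finite trees is 3/7, so the remaining 4/7 is "lost" to derivations that never terminate; this is the kind of missing mass the abstract refers to. At exactly p = 0.5 the iteration converges only slowly, so that boundary case is left out of the demonstration.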

Language: English
Title of host publication: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics
Subtitle of host publication: ACL 2013 : 4-9 August, Sofia, Bulgaria
Place of Publication: Stroudsburg, PA
Publisher: Association for Computational Linguistics (ACL)
Pages: 1033-1041
Number of pages: 9
Volume: 1
ISBN (Print): 9781937284503
Publication status: Published - 2013
Event: 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 - Sofia, Bulgaria
Duration: 4 Aug 2013 to 9 Aug 2013

Other

Other: 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013
Country: Bulgaria
City: Sofia
Period: 4/08/13 to 9/08/13

Fingerprint

grammar
Markov Chain Monte Carlo
Sampler

Cite this

Cohen, S. B., & Johnson, M. (2013). The effect of non-tightness on Bayesian estimation of PCFGs. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: ACL 2013 : 4-9 August, Sofia, Bulgaria (Vol. 1, pp. 1033-1041). Stroudsburg, PA: Association for Computational Linguistics (ACL).
@inproceedings{3620041bdf19487baf36be920b4cc9e1,
title = "The effect of non-tightness on Bayesian estimation of PCFGs",
abstract = "Probabilistic context-free grammars have the unusual property of not always defining tight distributions (i.e., the sum of the {"}probabilities{"} of the trees the grammar generates can be less than one). This paper reviews how this non-tightness can arise and discusses its impact on Bayesian estimation of PCFGs. We begin by presenting the notion of {"}almost everywhere tight grammars{"} and show that linear CFGs follow it. We then propose three different ways of reinterpreting non-tight PCFGs to make them tight, show that the Bayesian estimators in Johnson et al. (2007) are correct under one of them, and provide MCMC samplers for the other two. We conclude with a discussion of the impact of tightness empirically.",
author = "Cohen, {Shay B.} and Mark Johnson",
year = "2013",
language = "English",
isbn = "9781937284503",
volume = "1",
pages = "1033--1041",
booktitle = "Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics",
publisher = "Association for Computational Linguistics (ACL)",
}

TY - GEN
T1 - The effect of non-tightness on Bayesian estimation of PCFGs
AU - Cohen, Shay B.
AU - Johnson, Mark
PY - 2013
Y1 - 2013
AB - Probabilistic context-free grammars have the unusual property of not always defining tight distributions (i.e., the sum of the "probabilities" of the trees the grammar generates can be less than one). This paper reviews how this non-tightness can arise and discusses its impact on Bayesian estimation of PCFGs. We begin by presenting the notion of "almost everywhere tight grammars" and show that linear CFGs follow it. We then propose three different ways of reinterpreting non-tight PCFGs to make them tight, show that the Bayesian estimators in Johnson et al. (2007) are correct under one of them, and provide MCMC samplers for the other two. We conclude with a discussion of the impact of tightness empirically.
UR - http://www.scopus.com/inward/record.url?scp=84907378489&partnerID=8YFLogxK
M3 - Conference proceeding contribution
SN - 9781937284503
VL - 1
SP - 1033
EP - 1041
BT - Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
CY - Stroudsburg, PA
ER -
