Squibs and discussions: The DOP estimation method is biased and inconsistent

Mark Johnson*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

33 Citations (Scopus)


A data-oriented parsing or DOP model for statistical parsing associates fragments of linguistic representations with numerical weights, where these weights are estimated by normalizing the empirical frequency of each fragment in a training corpus (see Bod [1998] and references cited therein). This note observes that this estimation method is biased and inconsistent; that is, the estimated distribution does not in general converge on the true distribution as the size of the training corpus increases.
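To make the estimator described above concrete, here is a minimal sketch of relative-frequency weighting over fragment occurrences. It rests on assumptions not stated in the abstract: fragments are represented as hypothetical `(root_label, fragment_string)` pairs, and counts are normalized among fragments that share a root label, which is how the estimator is usually presented for DOP. The function name `relative_frequency_weights` and the toy data are illustrative only. The sketch shows the estimator itself, not the argument that it is biased and inconsistent.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def relative_frequency_weights(fragments):
    """Estimate fragment weights by normalizing empirical counts.

    `fragments` is an iterable of (root_label, fragment_string) pairs,
    one pair per fragment occurrence in the training corpus.
    Assumption: counts are normalized per root label.
    """
    counts = Counter(fragments)

    # Total number of fragment occurrences for each root label.
    totals = defaultdict(int)
    for (root, _), n in counts.items():
        totals[root] += n

    # Weight = count of this fragment / count of all fragments with the same root.
    return {frag: Fraction(n, totals[frag[0]]) for frag, n in counts.items()}


# Hypothetical toy corpus of fragment occurrences.
toy = [
    ("S", "(S NP VP)"),
    ("S", "(S NP VP)"),
    ("S", "(S (NP she) VP)"),
    ("NP", "(NP she)"),
]
weights = relative_frequency_weights(toy)
```

With this toy input, the two S-rooted fragments receive weights that sum to one, and the single NP-rooted fragment receives the whole weight for its root label.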

Original language: English
Pages (from-to): 71-76
Number of pages: 6
Journal: Computational Linguistics
Issue number: 1
Publication status: Published - Mar 2002
Externally published: Yes

