A Particle Filter algorithm for Bayesian word segmentation

Benjamin Börschinger, Mark Johnson

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

Bayesian models are usually learned using batch algorithms that have to iterate multiple times over the full dataset. This is both computationally expensive and, from a cognitive point of view, highly implausible. We present a novel online algorithm for the word segmentation models of Goldwater et al. (2009) which is, to our knowledge, the first published version of a Particle Filter for this kind of model. Also, in contrast to other proposed algorithms, it comes with a theoretical guarantee of optimality if the number of particles goes to infinity. While this is, of course, a theoretical point, a first experimental evaluation of our algorithm shows that, as predicted, its performance improves with the use of more particles, and that it performs competitively with other online learners proposed in Pearl et al. (2011).
Original language: English
Title of host publication: Proceedings of the 2011 Australasian Language Technology Workshop
Editors: Diego Molla, David Martinez
Place of Publication: Canberra, Australia
Publisher: Australasian Language Technology Association
Pages: 10-18
Number of pages: 9
Publication status: Published - 2011
Event: Australasian Language Technology Association Workshop (9th : 2011) - Canberra
Duration: 1 Dec 2011 to 2 Dec 2011

Publication series

Name: Australasian Language Technology Association Workshop
Volume: 9
ISSN (Print): 1834-7037

Workshop

Workshop: Australasian Language Technology Association Workshop (9th : 2011)
City: Canberra
Period: 1/12/11 to 2/12/11

Bibliographical note

Copyright the Publisher 2011. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
