Topic segmentation with a structured topic model

Lan Du, Wray Buntine, Mark Johnson

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding contribution; Research; peer-review

Abstract

We present a new hierarchical Bayesian model for unsupervised topic segmentation. This new model integrates a point-wise boundary sampling algorithm used in Bayesian segmentation into a structured topic model that can capture a simple hierarchical topic structure latent in documents. We develop an MCMC inference algorithm to split/merge segment(s). Experimental results show that our model outperforms previous unsupervised segmentation methods using only lexical information on Choi's datasets and two meeting transcripts and has performance comparable to those previous methods on two written datasets.
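The idea of point-wise boundary sampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' structured topic model: it assumes a much simpler stand-in in which each segment is a bag of words scored by a collapsed symmetric Dirichlet-multinomial, and each candidate boundary is resampled in turn by comparing the "split" and "merged" configurations of its enclosing span. The function names (`log_dm`, `sample_boundaries`), the hyperparameters (`beta`, `p_boundary`), and the toy document are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def log_dm(counts, vocab_size, beta):
    """Log marginal likelihood of a bag of words under a symmetric
    Dirichlet-multinomial with concentration `beta` per word type."""
    n = sum(counts.values())
    out = math.lgamma(vocab_size * beta) - math.lgamma(vocab_size * beta + n)
    for c in counts.values():
        out += math.lgamma(beta + c) - math.lgamma(beta)
    return out


def bag(sentences, start, end):
    """Word counts of sentences[start:end]."""
    counts = Counter()
    for sentence in sentences[start:end]:
        counts.update(sentence)
    return counts


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_boundaries(sentences, n_sweeps=50, beta=0.1, p_boundary=0.2, seed=0):
    """Gibbs-style sweeps over candidate boundaries (assumed simplified model).

    boundary[i] is True when a new segment starts at sentence i.
    Returns the sorted list of boundary positions after the last sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for s in sentences for w in s})
    n = len(sentences)
    boundary = [False] * n
    log_prior_odds = math.log(p_boundary) - math.log(1.0 - p_boundary)
    for _ in range(n_sweeps):
        for i in range(1, n):
            # Find the span [left, right) that position i would split.
            left = i - 1
            while left > 0 and not boundary[left]:
                left -= 1
            right = i + 1
            while right < n and not boundary[right]:
                right += 1
            merged = log_dm(bag(sentences, left, right), vocab_size, beta)
            split = (log_dm(bag(sentences, left, i), vocab_size, beta)
                     + log_dm(bag(sentences, i, right), vocab_size, beta))
            # Posterior probability of a boundary at i given all others.
            p_split = sigmoid(split - merged + log_prior_odds)
            boundary[i] = rng.random() < p_split
    return [i for i in range(1, n) if boundary[i]]


# Toy document (hypothetical): three "sentences" per topic.
toy_doc = [
    ["cat", "dog", "pet"], ["dog", "pet", "cat"], ["pet", "cat", "dog"],
    ["stock", "bond", "fund"], ["bond", "fund", "stock"], ["fund", "stock", "bond"],
]
found = sample_boundaries(toy_doc)
print(found)
```

Setting or clearing a single boundary amounts to splitting or merging the two adjacent segments, which is the sense in which a point-wise sampler performs split/merge moves. The model in the paper replaces the per-segment bag-of-words likelihood used here with a hierarchical topic structure.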

Language: English
Title of host publication: NAACL HLT 2013
Subtitle of host publication: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference
Place of Publication: Stroudsburg, PA
Publisher: Association for Computational Linguistics (ACL)
Pages: 190-200
Number of pages: 11
ISBN (Electronic): 9781937284473
Publication status: Published - 2013
Event: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013 - Atlanta, United States
Duration: 9 Jun 2013 to 14 Jun 2013

Other

Other: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
Country: United States
City: Atlanta
Period: 9/06/13 to 14/06/13

Fingerprint

Sampling
Segmentation
performance
Split
Markov Chain Monte Carlo
Bayesian Model
Inference

Cite this

Du, L., Buntine, W., & Johnson, M. (2013). Topic segmentation with a structured topic model. In NAACL HLT 2013: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference (pp. 190-200). Stroudsburg, PA: Association for Computational Linguistics (ACL).
@inproceedings{31a18ffd74544e09a39f5f21e9647c5b,
title = "Topic segmentation with a structured topic model",
author = "Lan Du and Wray Buntine and Mark Johnson",
year = "2013",
language = "English",
pages = "190--200",
booktitle = "NAACL HLT 2013",
publisher = "Association for Computational Linguistics (ACL)",

}

TY - GEN
T1 - Topic segmentation with a structured topic model
AU - Du, Lan
AU - Buntine, Wray
AU - Johnson, Mark
PY - 2013
Y1 - 2013
UR - http://www.scopus.com/inward/record.url?scp=84907030582&partnerID=8YFLogxK
M3 - Conference proceeding contribution
SP - 190
EP - 200
BT - NAACL HLT 2013
PB - Association for Computational Linguistics (ACL)
CY - Stroudsburg, PA
ER -
