Sampling table configurations for the hierarchical Poisson-Dirichlet Process

Changyou Chen*, Lan Du, Wray Buntine

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution

18 Citations (Scopus)

Abstract

Hierarchical modeling and reasoning are fundamental in machine intelligence, and for this the two-parameter Poisson-Dirichlet Process (PDP) plays an important role. The most popular MCMC sampling algorithm for the hierarchical PDP and the hierarchical Dirichlet Process performs incremental sampling based on the Chinese restaurant metaphor, which originates from the Chinese restaurant process (CRP). In this paper, using the same metaphor, we propose a new table representation for hierarchical PDPs by introducing an auxiliary latent variable, called a table indicator, to record which customer takes responsibility for starting a new table. The new representation allows full exchangeability, which is an essential condition for a correct Gibbs sampling algorithm. Based on this representation, we develop a block Gibbs sampling algorithm that jointly samples the data item and its table contribution. We test this on the hierarchical Dirichlet process variant of latent Dirichlet allocation (HDP-LDA) developed by Teh, Jordan, Beal and Blei. Experimental results show that the proposed algorithm outperforms their "posterior sampling by direct assignment" algorithm in both out-of-sample perplexity and convergence speed. The representation can be used with many other hierarchical PDP models.
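The Chinese restaurant metaphor mentioned in the abstract can be illustrated with a minimal sketch of the standard two-parameter seating rule for a single restaurant: with n customers already seated at K tables, the next customer joins table k with probability (n_k - a)/(n + b) and opens a new table with probability (b + aK)/(n + b), where a is the discount and b the concentration. The function name, the parameter names, and the boolean "opened a table" flag are illustrative assumptions; the flag is only a simplified stand-in for the table indicator, and this is not the hierarchical block Gibbs sampler proposed in the paper.

```python
import random


def sample_crp_seating(num_customers, discount, concentration, seed=0):
    """Seat customers one at a time under a two-parameter CRP.

    Returns (table_counts, opened_table). opened_table[i] is True when
    customer i started a new table; this is a simplified, hypothetical
    stand-in for a table indicator, not the paper's definition.
    """
    # Assumption for this sketch: 0 <= discount < 1 and concentration > 0,
    # which keeps every seating weight strictly positive.
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= 0.0:
        raise ValueError("concentration must be positive in this sketch")

    rng = random.Random(seed)
    table_counts = []   # number of customers at each table
    opened_table = []   # one flag per customer

    for _ in range(num_customers):
        num_tables = len(table_counts)
        # Unnormalised weights: existing tables first, the new table last.
        weights = [count - discount for count in table_counts]
        weights.append(concentration + discount * num_tables)
        choice = rng.choices(range(num_tables + 1), weights=weights)[0]

        if choice == num_tables:
            table_counts.append(1)
            opened_table.append(True)
        else:
            table_counts[choice] += 1
            opened_table.append(False)

    return table_counts, opened_table
```

Calling `sample_crp_seating(100, 0.5, 1.0)` returns the occupancy of each table together with one flag per customer. The number of `True` flags always equals the number of tables, which is the bookkeeping idea the table-indicator representation builds on.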

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, Proceedings
Pages: 296-311
Number of pages: 16
Volume: 6911 LNAI
Edition: PART 1
Publication status: Published - 2011
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2011 - Athens, Greece
Duration: 5 Sep 2011 to 9 Sep 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6911 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2011
Country: Greece
City: Athens
Period: 5/09/11 to 9/09/11

Keywords

  • block Gibbs sampler
  • Dirichlet Processes
  • HDP-LDA
  • Hierarchical Poisson-Dirichlet Processes
