Thumbnail
Access Restriction
Subscribed

Author Blei, David M. ♦ Griffiths, Thomas L. ♦ Jordan, Michael I.
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Copyright Year ©2010
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Bayesian nonparametric statistics ♦ Unsupervised learning
Abstract We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning—the use of Bayesian nonparametric methods to infer distributions on flexible data structures.
ISSN 00045411
Age Range 18 to 22 years ♦ above 22 year
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2010-02-08
Publisher Place New York
e-ISSN 1557735X
Journal Journal of the ACM (JACM)
Volume Number 57
Issue Number 2
Page Count 30
Starting Page 1
Ending Page 30


Open content in new tab

   Open content in new tab
Source: ACM Digital Library