
Catalog record details

The Nested Chinese Restaurant Process and Bayesian Nonparametric Inference of Topic Hierarchies

Article by: Blei, David M.; Griffiths, Thomas L.; Jordan, Michael I.

Abstract: We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP lead to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning: the use of Bayesian nonparametric methods to infer distributions on flexible data structures.
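Illustrative note (not part of the catalogued article): the abstract describes documents as paths down a random tree whose branches are chosen by preferential attachment. The sketch below shows one way such paths could be drawn, assuming the standard Chinese restaurant process rule at every node: an existing child is chosen with probability proportional to the number of earlier documents that passed through it, and a new child with probability proportional to a concentration parameter. The function name `sample_ncrp_paths`, the parameter `gamma`, and the truncation to a finite `depth` are assumptions made for illustration; the process in the abstract is infinitely deep, and none of this is taken from the authors' code.

```python
import random


def sample_ncrp_paths(num_docs, depth, gamma, seed=0):
    """Draw root-to-leaf paths from a depth-truncated nested CRP (illustrative)."""
    rng = random.Random(seed)
    # Maps a node (tuple of child indices from the root) to a list holding,
    # for each of its children, how many documents have passed through it.
    child_counts = {}
    paths = []
    for _ in range(num_docs):
        node = ()  # the root is the empty tuple
        path = [node]
        for _ in range(depth - 1):
            counts = child_counts.setdefault(node, [])
            # Existing child k has weight counts[k]; a new child has weight gamma.
            total = sum(counts) + gamma
            r = rng.random() * total
            chosen = len(counts)  # default: open a new child
            acc = 0.0
            for k, c in enumerate(counts):
                acc += c
                if r < acc:
                    chosen = k
                    break
            if chosen == len(counts):
                counts.append(0)
            counts[chosen] += 1
            node = node + (chosen,)
            path.append(node)
        paths.append(path)
    return paths


if __name__ == "__main__":
    for p in sample_ncrp_paths(num_docs=5, depth=3, gamma=1.0, seed=1):
        print(p[-1])  # leaf node, written as child indices from the root
```

Documents whose paths share a long prefix share nodes at several levels of the tree, which is the multi-level clustering effect the abstract refers to. Smaller values of `gamma` make new branches rarer, so paths tend to concentrate on fewer nodes.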


Language: English