A latent topic model for linked documents
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
POSTER SESSION: Posters
Pages 720-721
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Zhen Guo  State University of New York at Binghamton, Binghamton, NY, USA
Shenghuo Zhu  NEC Laboratories America, Inc., Cupertino, CA, USA
Yun Chi  NEC Laboratories America, Inc., Cupertino, CA, USA
Zhongfei Zhang  State University of New York at Binghamton, Binghamton, NY, USA
Yihong Gong  NEC Laboratories America, Inc., Cupertino, CA, USA
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1571941.1572095

ABSTRACT

Documents in many corpora, such as digital libraries and webpages, contain both content and link information. To account explicitly for the document relations that links represent, we propose a citation-topic (CT) model, which assumes a probabilistic generative process for corpora. In the CT model, a given document is modeled as a mixture of topic distributions, each of which is borrowed (cited) from a document related to the given document. Moreover, the CT model contains a random process that selects the related documents according to the structure of the generative model, which is determined by the links; the model therefore captures the transitivity of the relations among documents. We apply the CT model to the document clustering task, and experimental comparisons against several state-of-the-art approaches demonstrate very promising performance.
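
The generative process described in the abstract might be sketched roughly as follows. This is an illustrative sketch only, not the authors' formulation: the abstract does not specify the random process, so the random walk with a stop probability used here is an assumption, and every name (`select_related`, `generate_document`, `links`, `topic_mix`, `topic_words`, `stop_prob`) is hypothetical.

```python
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    r = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return index
    return len(probs) - 1  # guard against floating-point rounding


def select_related(doc, links, stop_prob, rng):
    """Pick the document whose topic distribution will be borrowed.

    ASSUMPTION: the selection is a random walk over links that stops at each
    step with probability stop_prob (use stop_prob > 0 if links contain cycles).
    Following several links in a row is what lets indirectly linked documents
    contribute, which is one way to capture transitivity.
    """
    current = doc
    while links[current] and rng.random() >= stop_prob:
        current = rng.choice(links[current])
    return current


def generate_document(doc, links, topic_mix, topic_words, length, stop_prob, rng):
    """Generate `length` word ids for document `doc`.

    links[d]:        list of documents that d links to (cites)
    topic_mix[c]:    topic distribution owned by document c
    topic_words[t]:  word distribution of topic t
    """
    words = []
    for _ in range(length):
        source = select_related(doc, links, stop_prob, rng)  # related document
        topic = sample_categorical(topic_mix[source], rng)   # borrowed topic distribution
        words.append(sample_categorical(topic_words[topic], rng))
    return words


if __name__ == "__main__":
    rng = random.Random(0)
    links = [[1], [2], []]                      # 0 cites 1, 1 cites 2
    topic_mix = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
    topic_words = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
    print(generate_document(0, links, topic_mix, topic_words, 10, 0.5, rng))
```

In this sketch, document 0 never links to document 2 directly, yet words of document 0 can still be drawn from document 2's topic distribution whenever the walk continues past document 1. Fitting such a model to data (inference) is out of scope here.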


