Mining multilingual topics from Wikipedia
Source
International World Wide Web Conference archive
Proceedings of the 18th international conference on World wide web
Madrid, Spain
POSTER SESSION: Thursday, April 23, 2009
Pages 1155-1156
Year of Publication: 2009
ISBN: 978-1-60558-487-4
Authors
Xiaochuan Ni, Microsoft Research Asia, Beijing, China
Jian-Tao Sun, Microsoft Research Asia, Beijing, China
Jian Hu, Microsoft Research Asia, Beijing, China
Zheng Chen, Microsoft Research Asia, Beijing, China
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1526709.1526904

ABSTRACT

In this paper, we leverage Wikipedia, a large-scale multilingual knowledge base, to help analyze and organize Web information written in different languages. Based on the observation that a single Wikipedia concept may be described by articles in different languages, we adapt an existing topic modeling algorithm to mine multilingual topics from this knowledge base. Each extracted 'universal' topic has multiple representations, one per language. New documents in different languages can therefore be represented in a common space spanned by a group of universal topics, which makes various multilingual Web applications feasible.
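
The idea of representing documents from different languages in one topic space can be illustrated with a toy sketch. This is not the authors' algorithm: the topics, the word probabilities, and the helper names `topic_vector` and `cosine` are all hypothetical, and the per-language word distributions are written by hand rather than learned from Wikipedia. The sketch only shows how one universal topic with one word distribution per language lets an English and a Spanish document be compared directly.

```python
import math
from collections import Counter

# Hypothetical universal topics. Each topic has one word distribution per
# language. The probabilities are invented for illustration only.
TOPICS = {
    "music": {
        "en": {"guitar": 0.4, "song": 0.4, "band": 0.2},
        "es": {"guitarra": 0.4, "canción": 0.4, "banda": 0.2},
    },
    "sports": {
        "en": {"football": 0.5, "team": 0.3, "goal": 0.2},
        "es": {"fútbol": 0.5, "equipo": 0.3, "gol": 0.2},
    },
}


def topic_vector(text, lang):
    """Map a document to normalized weights over the universal topics.

    Each topic is scored using only the word distribution for the
    document's own language. Topics are ordered alphabetically by name.
    """
    counts = Counter(text.lower().split())
    scores = []
    for name in sorted(TOPICS):
        dist = TOPICS[name][lang]
        scores.append(sum(n * dist.get(word, 0.0) for word, n in counts.items()))
    total = sum(scores)
    return [s / total for s in scores] if total else scores


def cosine(a, b):
    """Cosine similarity between two topic vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


en_sports = topic_vector("the team scored a goal", "en")
es_sports = topic_vector("el equipo marcó un gol", "es")
en_music = topic_vector("the band played a song", "en")

print(cosine(en_sports, es_sports))  # same topic, different languages
print(cosine(en_music, es_sports))   # different topics
```

In this toy setup the English and Spanish sports documents map to the same topic vector even though they share no words, while the English music document maps to a different one. A real system would learn the per-language distributions from aligned Wikipedia articles instead of hard-coding them.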


