Exploring social annotations for information retrieval
Source
International World Wide Web Conference
Proceeding of the 17th international conference on World Wide Web
Beijing, China
SESSION: Social networks: applications and infrastructures
Pages 715-724
Year of Publication: 2008
ISBN: 978-1-60558-085-2
Authors
Ding Zhou, Facebook Inc., Palo Alto, CA, USA
Jiang Bian, Georgia Institute of Technology, Atlanta, GA, USA
Shuyi Zheng, The Pennsylvania State University, University Park, PA, USA
Hongyuan Zha, Georgia Institute of Technology, Atlanta, GA, USA
C. Lee Giles, The Pennsylvania State University, University Park, PA, USA
Bibliometrics
Downloads (6 Weeks): 31, Downloads (12 Months): 318, Citation Count: 7
ABSTRACT
Social annotation has gained increasing popularity in many Web-based applications, leading to an emerging research area in text analysis and information retrieval. This paper develops probabilistic models and computational algorithms for social annotations. We propose a unified framework that combines the modeling of social annotations with language-modeling-based methods for information retrieval. The proposed approach consists of two steps: (1) discovering topics in the contents and annotations of documents while categorizing the users by domain; and (2) enhancing document and query language models by incorporating user domain interests as well as topical background models. In particular, we propose a new general generative model for social annotations, which is then simplified to a computationally tractable hierarchical Bayesian network. We then apply smoothing techniques in a risk minimization framework to incorporate the topical information into language models. Experiments are carried out on a real-world annotation data set sampled from del.icio.us. Our results demonstrate significant improvements over traditional approaches.
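The smoothing step mentioned in the abstract can be illustrated with a minimal sketch of linear-interpolation (Jelinek-Mercer style) smoothing for query-likelihood ranking. This is not the authors' model: the function names, the toy documents, the weight `lam`, and the `floor` value are all assumptions made for illustration, and `background_lm` merely stands in for whatever background distribution is used (here a collection-wide unigram model rather than a topical one).

```python
import math
from collections import Counter


def unigram_lm(tokens):
    """Maximum-likelihood unigram model: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def smoothed_prob(word, doc_lm, background_lm, lam=0.7, floor=1e-9):
    """Interpolate the document model with a background model.

    lam weights the document model; floor (an assumed constant) keeps the
    probability positive for words unseen in the background model.
    """
    return lam * doc_lm.get(word, 0.0) + (1.0 - lam) * background_lm.get(word, floor)


def query_log_likelihood(query, doc_lm, background_lm, lam=0.7):
    """Sum of log smoothed probabilities of the query words (requires lam < 1)."""
    return sum(math.log(smoothed_prob(w, doc_lm, background_lm, lam)) for w in query)


# Toy documents: page content and tags are simply pooled into one token list.
docs = {
    "a": "python tutorial programming python".split(),
    "b": "cooking recipes pasta dinner".split(),
}
background = unigram_lm([tok for toks in docs.values() for tok in toks])
doc_lms = {name: unigram_lm(toks) for name, toks in docs.items()}

query = ["python", "tutorial"]
scores = {name: query_log_likelihood(query, lm, background) for name, lm in doc_lms.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this sketch the document that actually contains the query words ranks first, while the other document still receives a finite score because the background model supplies nonzero probability mass.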
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
1. T. Berners-Lee, J. Hendler, and O. Lassila. The semantic web. Scientific American, 284(5):34--43, 2001.
3. Stephen Dill, Nadav Eiron, David Gibson, Daniel Gruhl, R. Guha, Anant Jhingran, Tapas Kanungo, Sridhar Rajagopalan, Andrew Tomkins, John A. Tomlin, Jason Y. Zien, SemTag and seeker: bootstrapping the semantic web via automated semantic annotation, Proceedings of the 12th international conference on World Wide Web, May 20-24, 2003, Budapest, Hungary. doi:10.1145/775152.775178
5. T. Griffiths and M. Steyvers. Finding scientific topics. In Proceedings of the National Academy of Sciences, 2004.
6. A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Information retrieval in folksonomies: Search and ranking. In Y. Sure and J. Domingue, editors, The Semantic Web: Research and Applications, volume 4011 of LNAI, pages 411--426, Heidelberg, June 2006. Springer.
9. F. Jelinek and R. Mercer. Interpolated estimation of Markov source parameters from sparse data. In Pattern Recognition in Practice, 1980.
12. John Lafferty, Chengxiang Zhai, Document language models, query models, and risk minimization for information retrieval, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.111-119, September 2001, New Orleans, Louisiana, United States. doi:10.1145/383952.383970
13. A. K. McCallum. Multi-label text classification with a mixture model trained by EM. In AAAI'99 Workshop on Text Learning, 1999.
17. Michal Rosen-Zvi, Thomas Griffiths, Mark Steyvers, Padhraic Smyth, The author-topic model for authors and documents, Proceedings of the 20th conference on Uncertainty in artificial intelligence, p.487-494, July 07-11, 2004, Banff, Canada
18. Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, Thomas Griffiths, Probabilistic author-topic models for information discovery, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA. doi:10.1145/1014052.1014087
19. Tao Tao, Xuanhui Wang, Qiaozhu Mei, ChengXiang Zhai, Language model information retrieval with document expansion, Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, p.407-414, June 04-09, 2006, New York, New York. doi:10.3115/1220835.1220887
22. Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, Hongyuan Zha, Probabilistic models for discovering e-communities, Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland. doi:10.1145/1135777.1135807
CITED BY 7
Jinwen Guo, Shengliang Xu, Shenghua Bao, Yong Yu, Tapping on the potential of q&a community by recommending answer providers, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Sheila Kinsella, Adriana Budura, Gleb Skobeltsyn, Sebastian Michel, John G. Breslin, Karl Aberer, From Web 1.0 to Web 2.0 and back - how did your grandma use to tag?, Proceeding of the 10th ACM workshop on Web information and data management, October 30, 2008, Napa Valley, California, USA
Elizeu Santos-Neto, David Condon, Nazareno Andrade, Adriana Iamnitchi, Matei Ripeanu, Individual and social behavior in tagging systems, Proceedings of the 20th ACM conference on Hypertext and hypermedia, June 29-July 01, 2009, Torino, Italy