Learning to recognize reliable users and content in social media with coupled mutual reinforcement
Source
International World Wide Web Conference
Proceedings of the 18th international conference on World wide web
Madrid, Spain
SESSION: Data mining/session: graph algorithms
Pages 51-60
Year of Publication: 2009
ISBN: 978-1-60558-487-4
Authors
Jiang Bian, Georgia Institute of Technology, Atlanta, GA, USA
Yandong Liu, Emory University, Atlanta, GA, USA
Ding Zhou, Facebook Inc., Palo Alto, CA, USA
Eugene Agichtein, Emory University, Atlanta, GA, USA
Hongyuan Zha, Georgia Institute of Technology, Atlanta, GA, USA
Bibliometrics
Downloads (6 Weeks): 47, Downloads (12 Months): 206, Citation Count: 0
ABSTRACT
Community Question Answering (CQA) has emerged as a popular forum for users to pose questions for other users to answer. Over the last few years, CQA portals such as Naver and Yahoo! Answers have exploded in popularity, and now provide a viable alternative to general-purpose Web search. At the same time, the answers to past questions submitted to CQA sites comprise a valuable knowledge repository that could be a gold mine for information retrieval and automatic question answering. Unfortunately, the quality of the submitted questions and answers varies widely, and increasingly so, to the point that a large fraction of the content is not usable for answering queries. Previous approaches for retrieving relevant and high-quality content have been proposed, but they require large amounts of manually labeled data, which limits the applicability of these supervised approaches to new sites and domains. In this paper we address this problem by developing a semi-supervised coupled mutual reinforcement framework that simultaneously calculates content quality and user reputation and requires relatively few labeled examples to initialize the training process. Results of a large-scale evaluation demonstrate that our methods are more effective than previous approaches for finding high-quality answers, questions, and users. More importantly, our quality estimation significantly improves the accuracy of search over CQA archives compared with state-of-the-art methods.
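The general idea of mutual reinforcement between content quality and user reputation, seeded by a few labels, can be illustrated with a toy sketch. This is not the framework from the paper, whose details are not given in the abstract. The function name `mutual_reinforcement`, the input format, the neutral starting score of 0.5, and the two update rules (reputation as the mean quality of a user's answers, unlabeled answers inheriting their author's reputation) are all assumptions made for illustration only.

```python
def mutual_reinforcement(authorship, seed_quality, iterations=20):
    """Toy coupled update between answer quality and user reputation.

    authorship:   dict mapping answer id -> user id (hypothetical format)
    seed_quality: dict mapping answer id -> label in [0, 1] for the few
                  answers that have been labeled by hand
    Returns (quality, reputation) dicts with scores in [0, 1].
    """
    # Labeled answers start at their label; everything else starts neutral.
    quality = {a: seed_quality.get(a, 0.5) for a in authorship}
    reputation = {u: 0.5 for u in set(authorship.values())}

    for _ in range(iterations):
        # Step 1: a user's reputation is the mean quality of their answers.
        scores_by_user = {}
        for answer, user in authorship.items():
            scores_by_user.setdefault(user, []).append(quality[answer])
        reputation = {u: sum(s) / len(s) for u, s in scores_by_user.items()}

        # Step 2: labeled answers keep their label; unlabeled answers
        # take on the current reputation of their author.
        for answer, user in authorship.items():
            if answer in seed_quality:
                quality[answer] = seed_quality[answer]
            else:
                quality[answer] = reputation[user]

    return quality, reputation


# Hypothetical data: four answers, two users, two hand-labeled answers.
authorship = {"a1": "alice", "a2": "alice", "a3": "bob", "a4": "bob"}
seed_quality = {"a1": 1.0, "a3": 0.0}
quality, reputation = mutual_reinforcement(authorship, seed_quality)
```

In this toy setting the two labels propagate through the authors: the unlabeled answer by the user with a good labeled answer drifts toward a high score, and the other drifts toward a low one. A realistic system would combine such propagation with learned features of the content and the users rather than relying on authorship alone.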