Optimizing web search using web click-through data
Source: Conference on Information and Knowledge Management
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Washington, D.C., USA
SESSION: IR-2 (information retrieval): web information retrieval
Pages: 118 - 126  
Year of Publication: 2004
ISBN:1-58113-874-1
Authors
Gui-Rong Xue  Shanghai Jiao-Tong University, Shanghai, P.R.China
Hua-Jun Zeng  Microsoft Research Asia, Beijing, P.R.China
Zheng Chen  Microsoft Research Asia, Beijing, P.R.China
Yong Yu  Shanghai Jiao-Tong University, Shanghai, P.R.China
Wei-Ying Ma  Microsoft Research Asia, Beijing, P.R.China
WenSi Xi  Virginia Polytechnic Institute and State University, VA
WeiGuo Fan  Virginia Polytechnic Institute and State University, VA
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1031171.1031192

ABSTRACT

The performance of web search engines often deteriorates because of the diverse and noisy information contained within web pages. User click-through data can be used to introduce more accurate descriptions (metadata) for web pages and thereby improve search performance. However, research on mining user click-through logs faces three major challenges: noise and incompleteness, sparseness, and the volatility of web pages and queries. In this paper, we propose a novel iterative reinforced algorithm that uses click-through data to improve search performance. The algorithm fully explores the interrelations between queries and web pages, effectively finds "virtual queries" for web pages, and overcomes the challenges discussed above. Experimental results on a large set of MSN click-through log data show a significant improvement in search performance over both the naive query log mining algorithm and the baseline search engine.
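
To make the idea of click-through metadata concrete, the following minimal Python sketch illustrates one plausible reading of the naive baseline: each clicked page is annotated with the terms of the queries that led to it. This is not the iterative reinforced algorithm proposed in the paper, whose details the abstract does not give. The function name `build_query_metadata`, the (query, URL) log format, and the example URLs are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_query_metadata(click_log):
    """Map each clicked URL to a Counter of the query terms that led to it.

    click_log is an iterable of (query, clicked_url) pairs; this log format
    is a hypothetical simplification, not the MSN log format.
    """
    metadata = defaultdict(Counter)
    for query, url in click_log:
        # Lowercase and split on whitespace; a real system would also
        # apply stemming and stop-word removal.
        metadata[url].update(query.lower().split())
    return dict(metadata)


# Hypothetical click log, invented for illustration only.
click_log = [
    ("cheap flights", "http://example.com/travel"),
    ("flights to paris", "http://example.com/travel"),
    ("python tutorial", "http://example.com/python"),
]

metadata = build_query_metadata(click_log)
for url, terms in metadata.items():
    print(url, dict(terms))
```

A page that has never been clicked receives no metadata under this scheme, which is one way the sparseness problem named in the abstract shows up in practice.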

