Learning to cluster web search results
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Sheffield, United Kingdom
SESSION: Clustering
Pages: 210 - 217
Year of Publication: 2004
ISBN: 1-58113-881-4
Authors
Hua-Jun Zeng, Microsoft Research, Asia, Beijing, P.R. China
Qi-Cai He, Peking University, Beijing, P.R. China
Zheng Chen, Microsoft Research, Asia, Beijing, P.R. China
Wei-Ying Ma, Microsoft Research, Asia, Beijing, P.R. China
Jinwen Ma, Peking University, Beijing, P.R. China
Bibliometrics
Downloads (6 Weeks): 64, Downloads (12 Months): 473, Citation Count: 67
ABSTRACT
Organizing Web search results into clusters facilitates users' quick browsing through search results. Traditional clustering techniques are inadequate since they don't generate clusters with highly readable names. In this paper, we reformalize the clustering problem as a salient phrase ranking problem. Given a query and the ranked list of documents (typically a list of titles and snippets) returned by a certain Web search engine, our method first extracts and ranks salient phrases as candidate cluster names, based on a regression model learned from human labeled training data. The documents are assigned to relevant salient phrases to form candidate clusters, and the final clusters are generated by merging these candidate clusters. Experimental results verify our method's feasibility and effectiveness.
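The pipeline outlined in the abstract (extract candidate phrases, rank them, assign documents, merge overlapping clusters) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the two features (document frequency and phrase length), the hand-set weights that stand in for the regression model learned from labeled data, and the overlap threshold used for merging.

```python
from collections import defaultdict


def candidate_phrases(query, snippets, max_n=3):
    """Map each word n-gram (n <= max_n) to the set of snippet indices containing it.

    Query words are dropped first so that the query itself is never a cluster name.
    """
    query_words = set(query.lower().split())
    docs_by_phrase = defaultdict(set)
    for idx, text in enumerate(snippets):
        words = [w for w in text.lower().split() if w not in query_words]
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                docs_by_phrase[" ".join(words[i:i + n])].add(idx)
    return docs_by_phrase


def score(phrase, doc_ids, n_docs, weights=(1.0, 0.5)):
    """Linear score over two illustrative features.

    The weights are hypothetical placeholders; a learned regression model
    would supply them in a real system.
    """
    doc_freq = len(doc_ids) / n_docs
    length = len(phrase.split())
    return weights[0] * doc_freq + weights[1] * length


def cluster(query, snippets, top_k=5, merge_overlap=0.75):
    """Return a list of (cluster_names, snippet_indices) pairs."""
    docs_by_phrase = candidate_phrases(query, snippets)
    # Keep phrases seen in at least two snippets, best score first
    # (ties broken alphabetically so the result is deterministic).
    ranked = sorted(
        ((p, d) for p, d in docs_by_phrase.items() if len(d) >= 2),
        key=lambda pd: (-score(pd[0], pd[1], len(snippets)), pd[0]),
    )[:top_k]

    clusters = []
    for phrase, docs in ranked:
        for names, members in clusters:
            overlap = len(docs & members) / min(len(docs), len(members))
            if overlap >= merge_overlap:
                # Merge this candidate into an existing, heavily overlapping cluster.
                names.append(phrase)
                members |= docs
                break
        else:
            clusters.append(([phrase], set(docs)))
    return clusters


snippets = [
    "jaguar car dealer",
    "jaguar car review",
    "jaguar cat habitat",
    "jaguar cat facts",
]
result = cluster("jaguar", snippets)
```

For the four toy snippets above, the sketch separates the "car" results from the "cat" results. A realistic system would also need stop-word handling and features trained on human judgments, which this sketch leaves out.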
CITED BY 67
Xin-Jing Wang, Wei-Ying Ma, Qi-Cai He, Xing Li, Grouping web image search result, Proceedings of the 12th annual ACM international conference on Multimedia, October 10-16, 2004, New York, NY, USA
Jin-Cheon Na, Christopher S. G. Khoo, Syin Chan, Norraihan Bte Hamzah, Sentiment-based search in digital libraries, Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries, June 07-11, 2005, Denver, CO, USA
Zhigang Hua, Hao Liu, Xing Xie, Hanqing Lu, Wei-Ying Ma, Representing personal web information using a topic-oriented interface, Special interest tracks and posters of the 14th international conference on World Wide Web, May 10-14, 2005, Chiba, Japan
Jian-Tao Sun, Xuanhui Wang, Dou Shen, Hua-Jun Zeng, Zheng Chen, CWS: a comparative web search system, Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland
Changhu Wang, Feng Jing, Lei Zhang, Hong-Jiang Zhang, Scalable search-based image annotation of personal images, Proceedings of the 8th ACM international workshop on Multimedia information retrieval, October 26-27, 2006, Santa Barbara, California, USA
Feng Jing, Changhu Wang, Yuhuan Yao, Kefeng Deng, Lei Zhang, Wei-Ying Ma, IGroup: web image search results clustering, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
Xirong Li, Le Chen, Lei Zhang, Fuzong Lin, Wei-Ying Ma, Image annotation by large-scale content-based image retrieval, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
Feng Jing, Changhu Wang, Yuhuan Yao, Kefeng Deng, Lei Zhang, Wei-Ying Ma, IGroup: a web image search engine with semantic clustering of search results, Proceedings of the 14th annual ACM international conference on Multimedia, October 23-27, 2006, Santa Barbara, CA, USA
Rui Li, Shenghua Bao, Yong Yu, Ben Fei, Zhong Su, Towards effective browsing of large scale social annotations, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
Shuo Wang, Feng Jing, Jibo He, Qixing Du, Lei Zhang, IGroup: presenting web image search results in semantic clusters, Proceedings of the SIGCHI conference on Human factors in computing systems, April 28-May 03, 2007, San Jose, California, USA
Xu Ling, Jing Jiang, Xin He, Qiaozhu Mei, Chengxiang Zhai, Bruce Schatz, Generating gene summaries from biomedical literature: A study of semi-structured summarization, Information Processing and Management: an International Journal, v.43 n.6, p.1777-1791, November, 2007
Xiaoguang Rui, Mingjing Li, Zhiwei Li, Wei-Ying Ma, Nenghai Yu, Bipartite graph reinforcement model for web image annotation, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
Xu Ling, Qiaozhu Mei, ChengXiang Zhai, Bruce Schatz, Mining multi-faceted overviews of arbitrary topics in a text collection, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis, Topical query decomposition, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
Ruihua Song, Zhenxiao Luo, Jian-Yun Nie, Yong Yu, Hsiao-Wuen Hon, Identification of ambiguous queries in web search, Information Processing and Management: an International Journal, v.45 n.2, p.216-229, March, 2009
Scarlett R. Herring, Chia-Chen Chang, Jesse Krantzler, Brian P. Bailey, Getting inspired!: understanding how and why examples are used in creative design practice, Proceedings of the 27th international conference on Human factors in computing systems, April 04-09, 2009, Boston, MA, USA