ABSTRACT
In this paper we propose a hierarchical clustering engine, called SnakeT, that organizes on the fly the search results drawn from 16 commodity search engines into a hierarchy of labeled folders. The hierarchy offers a complementary view to the flat ranked list of results returned by current search engines. Users can navigate through the hierarchy driven by their search needs. This is especially useful for informative, polysemous and poor queries.
SnakeT is the first complete and open-source system in the literature that offers both hierarchical clustering and folder labeling with variable-length sentences. We extensively test SnakeT against all available web-snippet clustering engines, and show that it achieves efficiency and efficacy performance close to the best known engine, Vivisimo.com.
Recently, personalized search engines have been introduced with the aim of improving search results by focusing on the users, rather than on their submitted queries. We show how to plug SnakeT on top of any (un-personalized) search engine in order to obtain a form of personalization that is fully adaptive, privacy-preserving, scalable, and non-intrusive for the underlying search engines.
CITED BY 19
Yabo Xu, Ke Wang, Benyu Zhang, Zheng Chen, Privacy-enhancing personalized web search, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis, Topical query decomposition, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
GunWoo Park, JinGi Chae, Dae Hee Lee, SangHoon Lee, Personalized search based on user intention through the hierarchical phrase vector model, Proceedings of the WSEAS International Conference on Applied Computing Conference, p.205-210, May 27-30, 2008, Istanbul, Turkey
Xin Li, Jun Yan, Weiguo Fan, Ning Liu, Shuicheng Yan, Zheng Chen, An online blog reading system by topic clustering and personalized ranking, ACM Transactions on Internet Technology (TOIT), v.9 n.3, p.1-26, July 2009
REVIEW
"Anthony Joseph Duben : Reviewer"
Searching the Web for information can be very frustrating. Search engines are based on different models, ranging from the very structured, in which Web sites are cataloged according to predefined hierarchical categories, to very amorphous reports ...