Evaluating contents-link coupled web page clustering for web search results
Source: Conference on Information and Knowledge Management
Proceedings of the eleventh international conference on Information and knowledge management
McLean, Virginia, USA
SESSION: Web clustering
Pages: 499-506
Year of Publication: 2002
ISBN: 1-58113-492-4
Authors
Yitong Wang, The University of Tokyo, Tokyo, Japan
Masaru Kitsuregawa, The University of Tokyo, Tokyo, Japan
Sponsors
SIGMIS: ACM Special Interest Group on Management Information Systems
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 16, Downloads (12 Months): 158, Citation Count: 7
DOI: http://doi.acm.org/10.1145/584792.584875

ABSTRACT

Clustering is currently one of the most crucial techniques for dealing with the massive amount of heterogeneous information on the web, for tasks such as locating resources and interpreting information. Unlike clustering in other fields, web page clustering separates unrelated pages and groups pages related to a specific topic into semantically meaningful clusters, which is useful for discrimination, summarization, organization and navigation of unstructured web pages. We have proposed a contents-link coupled clustering algorithm that clusters web pages by combining content and link analysis. In this paper, we study the effects of out-links (from a web page), in-links (to a web page) and terms on the final clustering results, as well as how to combine these three parts effectively to improve clustering quality. We apply the algorithm to cluster web search results. Preliminary experiments and evaluations are conducted on various topics. The experimental results show that the proposed clustering algorithm is effective and promising.
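
The abstract does not state how out-links, in-links and terms are combined, so the following is only an illustrative sketch, not the authors' algorithm. It assumes each page is described by three sparse count vectors, that each pair of vectors is compared by cosine similarity, and that the three scores are mixed linearly. The names `cosine`, `combined_similarity`, the dictionary keys and the equal default weights are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as sharing nothing
    return dot / (norm_a * norm_b)


def combined_similarity(p, q, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted mix of out-link, in-link and term similarity.

    Assumption: a linear combination with weights summing to 1.
    """
    w_out, w_in, w_term = weights
    return (
        w_out * cosine(p["out"], q["out"])
        + w_in * cosine(p["in"], q["in"])
        + w_term * cosine(p["terms"], q["terms"])
    )


# Toy pages: made-up link identifiers and terms, for illustration only.
page_a = {
    "out": Counter(["u1", "u2"]),
    "in": Counter(["v1"]),
    "terms": Counter("jaguar car engine".split()),
}
page_b = {
    "out": Counter(["u1", "u2"]),
    "in": Counter(["v1"]),
    "terms": Counter("jaguar car dealer".split()),
}
page_c = {
    "out": Counter(["u9"]),
    "in": Counter(["v9"]),
    "terms": Counter("jaguar cat habitat".split()),
}

sim_ab = combined_similarity(page_a, page_b)
sim_ac = combined_similarity(page_a, page_c)
```

In this toy data, pages A and B share all their links and most of their terms, while A and C share only the ambiguous term "jaguar", so `sim_ab` comes out larger than `sim_ac`. Changing the `weights` tuple is one way to explore how much each of the three parts contributes.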


