A comparison of implicit and explicit links for web page classification
Source International World Wide Web Conference archive
Proceedings of the 15th international conference on World Wide Web
Edinburgh, Scotland
SESSION: Data mining classification
Pages: 643 - 650  
Year of Publication: 2006
ISBN:1-59593-323-9
Authors
Dou Shen  Hong Kong University of Science and Technology, Kowloon, Hong Kong
Jian-Tao Sun  Microsoft Research Asia, Beijing, P.R.China
Qiang Yang  Hong Kong University of Science and Technology, Kowloon, Hong Kong
Zheng Chen  Microsoft Research Asia, Beijing, P.R.China
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 102,   Citation Count: 8
DOI: http://doi.acm.org/10.1145/1135777.1135871

ABSTRACT

It is well known that Web-page classification can be enhanced by using hyperlinks that provide linkages between Web pages. However, in the Web space, hyperlinks are usually sparse and noisy, and thus in many situations can provide only limited help in classification. In this paper, we extend the concept of linkages from explicit hyperlinks to implicit links built between Web pages. Observing that people who search the Web with the same query often click on different but related documents, we draw implicit links between Web pages that are clicked after the same query. We provide an approach for automatically building the implicit links between Web pages using Web query logs, together with a thorough comparison between the uses of implicit and explicit links in Web page classification. Our experimental results on a large dataset confirm that using implicit links yields better classification performance than using explicit links, with an increase of more than 10.5% in terms of the Macro-F1 measure.
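
The link-building idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the log format (pairs of query and clicked URL), the query normalization, the function name `build_implicit_links`, and the choice to weight a link by the number of distinct queries that two pages share are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_implicit_links(click_log):
    """Build weighted implicit links from a query log.

    click_log: iterable of (query, clicked_url) pairs. This format is a
    hypothetical simplification; real query logs carry more fields.
    Returns a dict mapping an ordered URL pair to the number of distinct
    queries after which both pages were clicked (an assumed weighting).
    """
    # Group clicked pages by normalized query text (assumed normalization).
    pages_by_query = defaultdict(set)
    for query, url in click_log:
        pages_by_query[query.strip().lower()].add(url)

    # Every pair of pages clicked after the same query gets an implicit link.
    link_weights = defaultdict(int)
    for pages in pages_by_query.values():
        for page_a, page_b in combinations(sorted(pages), 2):
            link_weights[(page_a, page_b)] += 1
    return dict(link_weights)


if __name__ == "__main__":
    # Made-up log entries, for illustration only.
    sample_log = [
        ("svm tutorial", "a.example/svm"),
        ("SVM tutorial ", "b.example/kernels"),
        ("naive bayes", "c.example/nb"),
    ]
    print(build_implicit_links(sample_log))
```

In this sketch the two pages clicked after the "svm tutorial" query become implicitly linked, while the page reached through a query that no other page shares stays unlinked. A classifier could then treat the linked pages as neighbors in the same way it would treat hyperlinked pages.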


