Web page classification without the web page
Source International World Wide Web Conference archive
Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
New York, NY, USA
POSTER SESSION: Posters
Pages: 262-263
Year of Publication: 2004
ISBN: 1-58113-912-8
Author
Min-Yen Kan  National University of Singapore, Singapore
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 108,   Citation Count: 11
DOI: http://doi.acm.org/10.1145/1013367.1013426

ABSTRACT

Uniform resource locators (URLs), which mark the address of a resource on the World Wide Web, are often human-readable and can hint at the category of the resource. This paper explores the use of URLs for webpage categorization via a two-phase pipeline of word segmentation/expansion and classification. We quantify its performance against document-based methods, which require the retrieval of the source document.
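
The two-phase idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the vocabulary, the abbreviation table, the category keyword sets, and the function names (segment, url_tokens, classify) are all hypothetical, and the keyword-overlap scorer merely stands in for whatever trained classifier the paper uses. Phase one splits a URL into chunks, segments concatenated words, and expands abbreviations; phase two assigns a category from the resulting tokens without fetching the page.

```python
import re
from urllib.parse import urlparse

# Toy resources, assumed for illustration only (not taken from the paper).
VOCAB = {"computer", "science", "home", "page", "news", "sports", "course"}
ABBREVIATIONS = {"cs": ["computer", "science"], "dept": ["department"]}
CATEGORY_KEYWORDS = {
    "course": {"course", "lecture", "syllabus"},
    "department": {"department", "computer", "science"},
}

def segment(chunk, vocab=VOCAB):
    """Split chunk into the fewest vocabulary words; return it whole if no full split exists."""
    best = [None] * (len(chunk) + 1)  # best[i] holds the best split of chunk[:i]
    best[0] = []
    for end in range(1, len(chunk) + 1):
        for start in range(end):
            word = chunk[start:end]
            if best[start] is not None and word in vocab:
                candidate = best[start] + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[-1] if best[-1] is not None else [chunk]

def url_tokens(url):
    """Phase 1: split the URL into alphabetic chunks, then expand or segment each chunk."""
    parsed = urlparse(url.lower())
    chunks = re.findall(r"[a-z]+", parsed.netloc + parsed.path)
    tokens = []
    for chunk in chunks:
        if chunk in ABBREVIATIONS:
            tokens.extend(ABBREVIATIONS[chunk])
        else:
            tokens.extend(segment(chunk))
    return tokens

def classify(url):
    """Phase 2 (stand-in): pick the category whose keyword set overlaps the URL tokens most."""
    tokens = set(url_tokens(url))
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    return max(scores, key=scores.get)

if __name__ == "__main__":
    url = "http://www.example.edu/cs/homepage.html"
    print(url_tokens(url))
    print(classify(url))
```

In this sketch only the URL string is consulted, which is the contrast the abstract draws with document-based methods that must first retrieve the source document.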


