Strategies for minimising errors in hierarchical web categorisation
Source: Conference on Information and Knowledge Management
Proceedings of the eleventh international conference on Information and knowledge management
McLean, Virginia, USA
SESSION: Web clustering
Pages: 525 - 531  
Year of Publication: 2002
ISBN:1-58113-492-4
Authors
Wahyu Wibowo  RMIT University, Melbourne, Australia
Hugh E. Williams  RMIT University, Melbourne, Australia
Sponsors
SIGMIS: ACM Special Interest Group on Management Information Systems
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 3,   Downloads (12 Months): 35,   Citation Count: 6
DOI: http://doi.acm.org/10.1145/584792.584878

ABSTRACT

On the Web, browsing and searching categories is a popular method of finding documents. Two well-known category-based search systems are the Yahoo! and DMOZ hierarchies, which are maintained by experts who assign documents to categories. However, manual categorisation by experts is costly, subjective, and does not scale with the increasing volumes of data that must be processed. Several methods have been investigated for effective automatic text categorisation. These include selection of categorisation methods, selection of pre-categorised training samples, use of hierarchies, and selection of document fragments or features. In this paper, we further investigate categorisation into Web hierarchies and the role of hierarchical information in improving categorisation effectiveness. We introduce new strategies to reduce errors in hierarchical categorisation. In particular, we propose novel techniques that shift the assignment into higher-level categories when lower-level assignment is uncertain. Our results show that absolute error rates can be reduced by over 2%.
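
The general idea of shifting an uncertain assignment to a higher-level category can be illustrated with a minimal sketch. This is not the method from the paper, whose details are not given in the abstract. It assumes one simple notion of uncertainty: a small score margin between the two best leaf categories. The hierarchy, the category names, the scores, the function name `assign_with_backoff`, and the `min_margin` threshold are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical toy hierarchy: each category maps to its parent (None for a root).
PARENT: Dict[str, Optional[str]] = {
    "Science": None,
    "Science/Physics": "Science",
    "Science/Biology": "Science",
}


def assign_with_backoff(leaf_scores: Dict[str, float], min_margin: float = 0.1) -> str:
    """Return the best-scoring leaf, or its parent when the decision is uncertain.

    Assumption (not from the paper): the assignment is treated as uncertain
    when the best score exceeds the runner-up score by less than min_margin.
    """
    # Rank leaf categories from highest to lowest score.
    ranked = sorted(leaf_scores.items(), key=lambda kv: kv[1], reverse=True)
    best, best_score = ranked[0]
    if len(ranked) > 1:
        runner_up_score = ranked[1][1]
        if best_score - runner_up_score < min_margin:
            parent = PARENT.get(best)
            if parent is not None:
                # Shift the assignment one level up the hierarchy.
                return parent
    return best


close_call = assign_with_backoff({"Science/Physics": 0.52, "Science/Biology": 0.48})
clear_call = assign_with_backoff({"Science/Physics": 0.90, "Science/Biology": 0.10})
```

With these made-up scores, the close call is assigned to the parent category and the clear call stays at the leaf. A fuller treatment would also need to handle a runner-up that lies in a different branch of the hierarchy, which this sketch ignores.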



