Framework for mining web content outliers
Source: Symposium on Applied Computing
Proceedings of the 2004 ACM Symposium on Applied Computing
Nicosia, Cyprus
SESSION: Data mining (DM)
Pages: 590-594
Year of Publication: 2004
ISBN: 1-58113-812-1
Authors
Malik Agyemang  University of Calgary, Calgary, Alberta, Canada
Ken Barker  University of Calgary, Calgary, Alberta, Canada
Reda Alhajj  University of Calgary, Calgary, Alberta, Canada
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/967900.968022

ABSTRACT

Outliers are data objects whose characteristics differ from those of other data objects. Exploring diverse and dynamic web data for outliers is more interesting than finding outliers in numeric data sets. Yet existing web mining algorithms have concentrated on finding frequent patterns while discarding the less frequent ones, which are the ones likely to contain the outlying data. This paper refers to outliers present on the web as web outliers to distinguish them from traditional outliers. Web outliers are data objects that show significantly different characteristics from other web data. Although the presence of web outliers appears obvious, there is neither a formal definition of web outliers nor any algorithm for mining them. Moreover, traditional outlier mining algorithms, designed solely for numeric data sets, are inappropriate for mining web outliers. This paper establishes the presence of web outliers and discusses some practical applications of web outlier mining. Finally, we present a taxonomy for web outliers and propose a general framework for mining web content outliers.



