XRules: an effective structural classifier for XML data
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Washington, D.C.
SESSION: Research track
Pages: 316 - 325  
Year of Publication: 2003
ISBN:1-58113-737-0
Authors
Mohammed J. Zaki, Rensselaer Polytechnic Institute
Charu C. Aggarwal, IBM T. J. Watson Research Center
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 13,   Downloads (12 Months): 72,   Citation Count: 30
DOI: http://doi.acm.org/10.1145/956750.956787

ABSTRACT

XML documents have recently become ubiquitous because of their varied applicability in a number of applications. Classification is an important problem in the data mining domain, but current classification methods for XML documents use IR-based methods in which each document is treated as a bag of words. Such techniques ignore a significant amount of information hidden inside the documents. In this paper we discuss the problem of rule based classification of XML data by using frequent discriminatory substructures within XML documents. Such a technique is more capable of finding the classification characteristics of documents. In addition, the technique can also be extended to cost sensitive classification. We show the effectiveness of the method with respect to other classifiers. We note that the methodology discussed in this paper is applicable to any kind of semi-structured data.


REFERENCES

2  C. Aggarwal, S. Gates, P. Yu. On the merits of using supervised clustering to build categorization systems. SIGKDD, 1999.
4  K. Alsabti, S. Ranka, V. Singh. CLOUDS: A Decision Tree Classifier for Large Datasets. SIGKDD, 1998.
5  R. Andersen et al. Professional XML. Wrox Press Ltd, 2002.
6  T. Asai et al. Efficient substructure discovery from large semi-structured data. 2nd SIAM Int'l Conference on Data Mining, 2002.
7  W. W. Cohen. Fast Effective Rule Induction. Int'l Conf. Machine Learning, 1995.
10  R. Duda, P. Hart. Pattern Classification and Scene Analysis. Wiley, New York, 1973.
16  B. Liu, W. Hsu, Y. Ma. Integrating Classification and Association Rule Mining. SIGKDD, 1998.

CITED BY  30
