Discovering frequently changing structures from historical structural deltas of unordered XML
Source: Conference on Information and Knowledge Management
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Washington, D.C., USA
SESSION: DB-3 (databases): data mining
Pages: 188 - 197  
Year of Publication: 2004
ISBN:1-58113-874-1
Authors
Qiankun Zhao  Nanyang Technological University, Singapore
Sourav S. Bhowmick  Nanyang Technological University, Singapore
Mukesh Mohania  IBM India Research Lab, India
Yahiko Kambayashi  Kyoto University, Japan
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1031171.1031210

ABSTRACT

Recently, a large amount of work has been done in XML data mining. However, most existing work focuses on snapshot XML data, whereas XML data in real applications is dynamic. To the best of our knowledge, no existing work has addressed the issue of mining the history of changes to XML documents. Such mining results can be useful in many applications, such as XML change detection, XML indexing, association rule mining, and classification. In this paper, we propose a novel approach to discover the frequently changing structures from the sequence of historical structural deltas of unordered XML. To make the structure discovery process efficient, we propose an expressive and compact data model, the Historical-Document Object Model (H-DOM). Using this model, we present two basic algorithms that can discover all the frequently changing structures with only two scans of the XML sequence. Experimental results show that our algorithms, together with the optimization techniques, are efficient and scalable.
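
The abstract does not define H-DOM, the change metrics, or the two algorithms, so the following is only a rough illustration of the general idea of mining a sequence of structural deltas, not the authors' method. It assumes a simplified notion of "structure" (a tag path), treats documents as unordered by comparing order-insensitive subtree signatures, ignores text content, and counts how often each path changes between consecutive versions. The names `signature`, `path_signatures`, `frequently_changing`, and `min_ratio` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def signature(elem):
    """Order-insensitive structural signature: tag plus sorted child signatures."""
    return (elem.tag, tuple(sorted(signature(child) for child in elem)))


def path_signatures(root):
    """Map each tag path to the sorted subtree signatures found at that path."""
    table = defaultdict(list)

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        table[path].append(signature(elem))
        for child in elem:
            walk(child, path)

    walk(root, "")
    return {path: sorted(sigs) for path, sigs in table.items()}


def frequently_changing(versions, min_ratio):
    """Return paths whose structure changed in at least min_ratio of the deltas.

    Assumption: a path "changes" in a delta when the set of subtree
    signatures at that path differs between two consecutive versions.
    """
    snapshots = [path_signatures(ET.fromstring(v)) for v in versions]
    changes = Counter()
    for old, new in zip(snapshots, snapshots[1:]):
        for path in old.keys() | new.keys():
            if old.get(path) != new.get(path):
                changes[path] += 1
    n_deltas = len(snapshots) - 1
    return {p: c / n_deltas for p, c in changes.items() if c / n_deltas >= min_ratio}


# Four hypothetical versions of one document: <e/> is inserted under <b>,
# then deleted, and finally <f/> is added under the root.
versions = [
    "<a><b><c/></b><d/></a>",
    "<a><d/><b><c/><e/></b></a>",
    "<a><b><c/></b><d/></a>",
    "<a><b><c/></b><d/><f/></a>",
]
result = frequently_changing(versions, min_ratio=0.6)
print(sorted(result))
```

Because sibling order is normalized away in `signature`, a pure reordering of children is not counted as a change, which matches the unordered setting. A change deep in the tree also marks every ancestor path as changed, which is one possible design choice among several.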



