XEdge: clustering homogeneous and heterogeneous XML documents using edge summaries
Symposium on Applied Computing
Proceedings of the 2008 ACM symposium on Applied computing
Fortaleza, Ceara, Brazil
SESSION: Information access and retrieval
Pages 1081-1088
Year of Publication: 2008
ISBN: 978-1-59593-753-7
ABSTRACT
In this paper, we propose a unified clustering algorithm for both homogeneous and heterogeneous XML documents. The algorithm adapts its distance metric to the type of XML documents being clustered, so that it accounts for the distinct structural characteristics of homogeneous and heterogeneous collections. We compare the quality of the resulting clusters with those produced by one of the latest XML clustering algorithms and show that our algorithm outperforms it on both homogeneous and heterogeneous XML documents.