A schema matching-based approach to XML schema clustering
Source
International Conference on Information Integration and Web-based Applications and Services
Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
Linz, Austria
SESSION: iiWAS 2008: XML data modelling and processing
Pages 131-136
Year of Publication: 2008
ISBN: 978-1-60558-349-5
ABSTRACT
The relationship between XML data clustering and schema matching is bidirectional: clustering techniques have been adopted to improve matching performance, and schema matching in turn serves as the backbone of clustering techniques. This paper presents a new approach for clustering XML schemas based on schema matching. In particular, we develop and implement an XML schema matching system that determines semantic similarities between XML schemas based on the Prüfer sequence representation of schema trees. The proposed similarity computation algorithm makes use of the semantic meaning of XML elements as well as the hierarchical features of XML schemas. The computed similarities are then exploited by an agglomerative clustering algorithm to group similar schemas. Our experimental results show that the proposed approach is fast and accurate in clustering heterogeneous XML schemas.
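The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `prufer_sequence` computes the classic Prüfer code of a labeled tree (the paper's exact variant for schema trees is not given here), and `agglomerative` is a plain average-linkage procedure with a hypothetical stopping `threshold`. The similarity matrix in the usage lines is made up for illustration and stands in for the similarities a schema matcher would produce.

```python
import heapq
from collections import defaultdict


def prufer_sequence(edges):
    """Classic Prüfer code of a labeled tree given as (u, v) integer edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Min-heap of current leaves, so the smallest-labeled leaf is removed first.
    leaves = [n for n in adj if len(adj[n]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(len(adj) - 2):
        leaf = heapq.heappop(leaves)
        parent = next(iter(adj[leaf]))  # a leaf has exactly one neighbor left
        seq.append(parent)
        adj[parent].discard(leaf)
        if len(adj[parent]) == 1:  # the parent has just become a leaf
            heapq.heappush(leaves, parent)
    return seq


def agglomerative(sim, threshold):
    """Average-linkage agglomerative clustering over a similarity matrix.

    Repeatedly merges the most similar pair of clusters and stops once no
    pair reaches `threshold` (a hypothetical parameter for this sketch).
    """
    clusters = [[i] for i in range(len(sim))]

    def link(a, b):
        # Mean pairwise similarity between the members of two clusters.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, x, y = max(
            (link(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Illustrative inputs only: a six-node tree and a made-up similarity matrix.
tree_edges = [(1, 4), (2, 4), (3, 4), (4, 5), (5, 6)]
code = prufer_sequence(tree_edges)

similarities = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.15, 0.1],
    [0.1, 0.15, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
groups = agglomerative(similarities, threshold=0.5)
```

In this sketch the tree encoding and the clustering step are deliberately decoupled: any similarity measure derived from the sequences (together with element-name semantics, as the abstract describes) could fill the matrix that `agglomerative` consumes.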