A schema matching-based approach to XML schema clustering
Source: International Conference on Information Integration and Web-based Applications and Services
Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
Linz, Austria
Session: iiWAS 2008: XML data modelling and processing
Pages 131-136  
Year of Publication: 2008
ISBN: 978-1-60558-349-5
Authors
Alsayed Algergawy  Magdeburg University, Magdeburg, Germany
Eike Schallehn  Magdeburg University, Magdeburg, Germany
Gunter Saake  Magdeburg University, Magdeburg, Germany
Sponsor
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1497308.1497337

ABSTRACT

The relationship between XML data clustering and schema matching is bidirectional. On the one hand, clustering techniques have been adopted to improve matching performance; on the other hand, schema matching is the backbone of the clustering technique. This paper presents a new approach for clustering XML schemas based on schema matching. In particular, we develop and implement an XML schema matching system that determines semantic similarities between XML schemas based on the Prüfer sequence representation of schema trees. The proposed similarity computation algorithm makes use of the semantic meaning of XML elements as well as the hierarchical features of XML schemas. The computed similarities are then exploited by an agglomerative clustering algorithm to group similar schemas. Our experimental results show that the proposed approach is fast and accurate in clustering heterogeneous XML schemas.
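
To make the Prüfer sequence representation mentioned above concrete, the following is a minimal Python sketch of the classic Prüfer construction for a labeled tree: repeatedly remove the leaf with the smallest label and record the label of its neighbour. The abstract does not specify which variant the paper uses, so this is an illustration of the general idea only. The function name `prufer_sequence`, the example element names, and the post-order numbering of the schema tree are all assumptions made for this example.

```python
import heapq
from collections import defaultdict


def prufer_sequence(edges):
    """Return the classic Prüfer sequence of a labeled tree.

    `edges` is a list of (u, v) pairs with comparable labels. The input is
    assumed to be a tree with at least one edge; only a basic edge-count
    check is performed here.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    n = len(adj)
    if len(edges) != n - 1:
        raise ValueError("a tree with n nodes must have exactly n - 1 edges")

    # Min-heap of the current leaves, so the smallest label is removed first.
    leaves = [node for node, nbrs in adj.items() if len(nbrs) == 1]
    heapq.heapify(leaves)

    sequence = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        (neighbour,) = adj[leaf]          # a leaf has exactly one neighbour
        sequence.append(neighbour)
        adj[neighbour].remove(leaf)
        del adj[leaf]
        if len(adj[neighbour]) == 1:      # the neighbour has become a leaf
            heapq.heappush(leaves, neighbour)
    return sequence


# Hypothetical schema tree, nodes numbered in post-order (an assumption):
#   book(5) -> title(1), author(4);  author(4) -> first(2), last(3)
schema_edges = [(1, 5), (2, 4), (3, 4), (4, 5)]
print(prufer_sequence(schema_edges))  # [5, 4, 4]
```

Because the sequence is a flat list of labels that still encodes the parent-child structure, two schema trees can be compared as sequences rather than as trees. How the paper combines such sequences with element semantics to compute similarity is not described in the abstract and is not modelled here.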


