Top-k generation of integrated schemas based on directed and weighted correspondences
Source
International Conference on Management of Data
Proceedings of the 35th SIGMOD international conference on Management of data
Providence, Rhode Island, USA
SESSION: Research session 17: data integration
Pages 641-654
Year of Publication: 2009
ISBN: 978-1-60558-551-2
Authors
Ahmed Radwan, University of Miami, Miami, FL, USA
Lucian Popa, IBM Almaden Research Center, San Jose, CA, USA
Ioana R. Stanoi, IBM Almaden Research Center, San Jose, CA, USA
Akmal Younis, University of Miami, Miami, FL, USA
ABSTRACT
Schema integration is the problem of creating a unified target schema from a set of existing source schemas and a set of correspondences that result from matching the source schemas. Previous methods for schema integration rely on the exploration, implicit or explicit, of the multiple design choices that are possible for the integrated schema. Such exploration relies heavily on user interaction; thus, it is time-consuming and labor-intensive. Furthermore, previous methods have ignored the additional information that typically results from the schema matching process, namely the weights and, in some cases, the directions associated with the correspondences. In this paper, we propose a more automatic approach to schema integration that is based on the use of directed and weighted correspondences between the concepts that appear in the source schemas. A key component of our approach is a novel top-k ranking algorithm for the automatic generation of the best candidate schemas. The algorithm gives more weight to schemas that combine the concepts with higher similarity or coverage. Thus, the algorithm makes certain decisions that would otherwise likely be made by a human expert. We show that the algorithm runs in polynomial time and, moreover, performs well in practice.
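To make the idea of ranking candidate schemas by correspondence weights more concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a deliberately simplified model: each weighted correspondence is independently either merged or kept separate, a candidate schema is one such set of choices, and a candidate earns `w` for a merged correspondence and `1 - w` for an unmerged one. The function name `top_k_merge_choices`, the scoring rule, and the sample concept names are all hypothetical. The top-k candidates are enumerated best-first with a heap (the standard k-smallest-subset-sums technique), so only k candidates are ever materialized.

```python
import heapq
from typing import List, Tuple

# (source concept, target concept, weight in [0, 1]); an assumed input shape.
Correspondence = Tuple[str, str, float]


def top_k_merge_choices(
    corrs: List[Correspondence], k: int
) -> List[Tuple[float, List[bool]]]:
    """Return up to k (score, merge flags) pairs, best score first.

    Assumed scoring rule (illustrative only): a merged correspondence
    contributes w, an unmerged one contributes 1 - w.
    """
    if k <= 0:
        return []
    # Best candidate: merge exactly the correspondences with w >= 0.5.
    base = [w >= 0.5 for _, _, w in corrs]
    base_score = sum(max(w, 1.0 - w) for _, _, w in corrs)
    # Flipping correspondence i away from its best choice costs |2w - 1|.
    order = sorted(range(len(corrs)), key=lambda i: abs(2 * corrs[i][2] - 1))
    costs = [abs(2 * corrs[i][2] - 1) for i in order]

    results = [(base_score, base)]
    # Heap entries: (total flip cost, last flipped position, flipped positions).
    heap = [(costs[0], 0, (0,))] if costs else []
    while heap and len(results) < k:
        cost, last, flipped = heapq.heappop(heap)
        choice = list(base)
        for pos in flipped:
            choice[order[pos]] = not choice[order[pos]]
        results.append((base_score - cost, choice))
        nxt = last + 1
        if nxt < len(costs):
            # Child 1: additionally flip the next-cheapest correspondence.
            heapq.heappush(heap, (cost + costs[nxt], nxt, flipped + (nxt,)))
            # Child 2: replace the last flip with the next-cheapest one.
            heapq.heappush(
                heap,
                (cost - costs[last] + costs[nxt], nxt, flipped[:-1] + (nxt,)),
            )
    return results


if __name__ == "__main__":
    sample = [
        ("Dept", "Department", 0.9),
        ("Emp", "Person", 0.6),
        ("Proj", "Grant", 0.2),
    ]
    for score, flags in top_k_merge_choices(sample, 3):
        print(round(score, 3), flags)
```

Each heap pop generates at most two successors, so producing k candidates takes on the order of k log k heap operations after sorting. A real integration method would also have to account for correspondence direction and for interactions between merge decisions, which this sketch ignores.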