Structural proximity searching for large collections of semi-structured data
Source
Conference on Information and Knowledge Management
Proceedings of the tenth international conference on Information and knowledge management
Atlanta, Georgia, USA
Session: Semistructured Data
Pages: 175-182
Year of Publication: 2001
ISBN: 1-58113-436-3
Bibliometrics
Downloads (6 Weeks): 6, Downloads (12 Months): 37, Citation Count: 6
ABSTRACT
The richness of the XML data format allows data to be structured in a way that precisely captures the semantics required by the author. It is the structure of the data, however, that forms the basis of all XML query languages. Without at least some notion of the structure, a user cannot meaningfully query the data. This problem is compounded when heterogeneous data adhering to different schemas are likely to exist in the database(s) being queried. This paper proposes a solution based on an efficient proximity index. In particular, we describe a family of encoding and compression schemes that enable us to build an index to efficiently implement the proximity search. Our index is extremely small, and can reflect updates in the underlying database in modest time. Experiments show that our algorithm and implementation are fast and scale well.
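To make the idea of a structure-agnostic proximity query concrete, the following is a minimal sketch that ranks pairs of XML elements by how close they are in the document tree, without the user naming any tags. It is not the paper's index: the abstract does not describe the encoding or compression schemes, so this sketch assumes a naive, uncompressed approach in which proximity is simply the number of edges between two elements. The names `tree_distance`, `nearest_pairs`, and the sample document `DOC` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the user is assumed not to know its schema.
DOC = """
<library>
  <book><title>Dune</title><author>Herbert</author></book>
  <book><title>Emma</title><author>Austen</author></book>
</library>
"""


def ancestors(node, parent):
    """Return the path from node up to the root, node first."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(a, b, parent):
    """Number of edges on the tree path between elements a and b."""
    steps_from_a = {n: i for i, n in enumerate(ancestors(a, parent))}
    for j, n in enumerate(ancestors(b, parent)):
        if n in steps_from_a:  # first shared ancestor is the lowest one
            return steps_from_a[n] + j
    raise ValueError("elements are not in the same tree")


def nearest_pairs(xml_text, term_a, term_b):
    """Rank (distance, tag_a, tag_b) for elements whose text matches the terms."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}

    def hits(term):
        return [e for e in root.iter()
                if e.text and term.lower() in e.text.lower()]

    pairs = [(tree_distance(x, y, parent), x.tag, y.tag)
             for x in hits(term_a) for y in hits(term_b)]
    return sorted(pairs)


print(nearest_pairs(DOC, "Dune", "Herbert"))  # same <book>: distance 2
print(nearest_pairs(DOC, "Dune", "Austen"))   # different <book>: distance 4
```

This sketch recomputes distances at query time by walking the tree, which would not scale to large collections; the paper's contribution, per the abstract, is a compact index that avoids that cost.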
CITED BY 6
Sara Cohen, Jonathan Mamou, Yaron Kanza, Yehoshua Sagiv, XSEarch: a semantic search engine for XML, Proceedings of the 29th international conference on Very large data bases, p.45-56, September 09-12, 2003, Berlin, Germany