Structural proximity searching for large collections of semi-structured data
Source: Conference on Information and Knowledge Management
Proceedings of the tenth international conference on Information and knowledge management
Atlanta, Georgia, USA
Session: Semistructured Data
Pages: 175-182
Year of Publication: 2001
ISBN: 1-58113-436-3
Authors
Michael Barg, University of New South Wales, Sydney, Australia
Raymond K. Wong, University of New South Wales, Sydney, Australia
Sponsors
SIGMIS: ACM Special Interest Group on Management Information Systems
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/502585.502615

ABSTRACT

The richness of the XML data format allows data to be structured in a way that precisely captures the semantics required by the author. It is the structure of the data, however, that forms the basis of all XML query languages. Without at least some notion of that structure, a user cannot meaningfully query the data. This problem is compounded when heterogeneous data adhering to different schemas are likely to exist in the database(s) being queried. This paper proposes a solution based on an efficient proximity index. In particular, we describe a family of encoding and compression schemes that enable us to build an index which efficiently implements the proximity search. Our index is extremely small and can reflect updates in the underlying database in modest time. Experiments show that our algorithm and implementation are fast and scale well.
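
To make the idea of structural proximity concrete, the sketch below measures how far apart two elements sit in a small XML document by counting parent/child edges between them. It is only an illustration under stated assumptions: the sample document, the helper names `build_adjacency` and `structural_distance`, and the use of a plain breadth-first search are all hypothetical, and nothing here reproduces the encoding, compression, or index described in the paper.

```python
import xml.etree.ElementTree as ET
from collections import deque


def build_adjacency(root):
    """Map each element to its neighbours (parent and children) in the tree."""
    adj = {root: []}
    for parent in root.iter():
        for child in parent:
            adj.setdefault(parent, []).append(child)
            adj.setdefault(child, []).append(parent)
    return adj


def structural_distance(adj, a, b):
    """Count parent/child edges on the shortest path from a to b (plain BFS)."""
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node is b:
            return dist
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # not reachable (cannot happen inside a single tree)


# Hypothetical sample data, invented purely for illustration.
doc = ET.fromstring(
    "<library>"
    "<book><title>XML Basics</title><author>Lee</author></book>"
    "<book><title>Graph Search</title><author>Kim</author></book>"
    "</library>"
)
adj = build_adjacency(doc)
titles = doc.findall("./book/title")
authors = doc.findall("./book/author")

# title -> book -> author
same_book = structural_distance(adj, titles[0], authors[0])
# title -> book -> library -> book -> author
other_book = structural_distance(adj, titles[0], authors[1])
```

Here a title is closer to the author in its own `book` element than to the author of another book, which is the kind of relationship a proximity search can exploit without the user knowing the schema. A search per query pair like this would not scale to large collections, which is the motivation for a precomputed index.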



