Searching XML documents via XML fragments
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Structured documents
Pages: 151 - 158  
Year of Publication: 2003
ISBN: 1-58113-646-3
Authors
David Carmel  IBM Research Lab in Haifa, Mount Carmel, Haifa
Yoelle S. Maarek  IBM Research Lab in Haifa, Mount Carmel, Haifa
Matan Mandelbrod  IBM Research Lab in Haifa, Mount Carmel, Haifa
Yosi Mass  IBM Research Lab in Haifa, Mount Carmel, Haifa
Aya Soffer  IBM Research Lab in Haifa, Mount Carmel, Haifa
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/860435.860464

ABSTRACT

Most of the work on XML query and search has stemmed from the publishing and database communities, primarily to meet the needs of business applications. Recently, the Information Retrieval community began investigating the XML search issue to answer information discovery needs. Following this trend, we present here an approach where information needs can be expressed in an approximate manner as pieces of XML documents or "XML fragments" of the same nature as the documents that are being searched. We present an extension of the vector space model for searching XML collections via XML fragments and ranking results by relevance. We describe how we have extended a full-text search engine to comply with this model. The value of the proposed method is demonstrated by the relatively high precision of our system, which was among the top performers in the recent INEX workshop. Our results indicate that certain queries are more appropriate than others for the extended vector space model. Specifically, queries with relatively specific contexts but vague information needs are best suited to reap the benefit of this model. Finally, our results show that one method may not fit all types of queries and that it could be worthwhile to use different solutions for different applications.
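
As a rough illustration of the idea of querying with an XML fragment under a vector space model, the following minimal sketch indexes each word together with the element path in which it occurs and ranks documents by cosine similarity against the fragment. Everything in it is an assumption made for illustration: the function names, the use of raw term counts as weights, and the requirement that contexts match exactly are hypothetical choices, not the weighting or matching scheme of the system described in the abstract.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def term_context_pairs(xml_text):
    """Count (term, context) pairs; context is the path of element tags.

    Hypothetical indexing unit for this sketch. Text that follows a
    child element (ElementTree's `tail`) is ignored for brevity.
    """
    root = ET.fromstring(xml_text)
    pairs = Counter()

    def walk(elem, path):
        here = path + "/" + elem.tag
        for word in (elem.text or "").lower().split():
            pairs[(word, here)] += 1
        for child in elem:
            walk(child, here)

    walk(root, "")
    return pairs


def cosine(query_vec, doc_vec):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(w * doc_vec.get(k, 0) for k, w in query_vec.items())
    norm_q = math.sqrt(sum(w * w for w in query_vec.values()))
    norm_d = math.sqrt(sum(w * w for w in doc_vec.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


def rank(fragment, docs):
    """Score every document against the query fragment, best first."""
    q = term_context_pairs(fragment)
    scored = [(cosine(q, term_context_pairs(text)), name)
              for name, text in docs.items()]
    return sorted(scored, reverse=True)


docs = {
    "a": "<book><title>xml retrieval</title></book>",
    "b": "<book><author>xml retrieval</author></book>",
}
ranked = rank("<book><title>xml</title></book>", docs)
```

In this toy collection, document "a" ranks first because the word "xml" appears under the same path (/book/title) as in the fragment, while document "b" contains the same word under a different path and scores zero. A less strict context match would be one way to reflect the "approximate" querying the abstract mentions, but how the actual system does this is not stated here.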

