Structured queries in XML retrieval
Source: Conference on Information and Knowledge Management
Proceedings of the 14th ACM international conference on Information and knowledge management
Bremen, Germany
SESSION: Paper session IR-1 (information retrieval): XML retrieval
Pages: 4-11
Year of Publication: 2005
ISBN: 1-59593-140-6
Authors
Jaap Kamps  University of Amsterdam, Amsterdam, The Netherlands
Maarten Marx  University of Amsterdam, Amsterdam, The Netherlands
Maarten de Rijke  University of Amsterdam, Amsterdam, The Netherlands
Börkur Sigurbjörnsson  University of Amsterdam, Amsterdam, The Netherlands
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1099554.1099559

ABSTRACT

Document-centric XML is a mixture of text and structure. With the increased availability of document-centric XML content comes a need for query facilities in which both structural constraints and constraints on the content of the documents can be expressed. How does the expressiveness of languages for querying XML documents help users to express their information needs? We address this question from both an experimental and a theoretical point of view. Our experimental analysis compares a structure-ignorant with a structure-aware retrieval approach using the test suite of the 2004 edition of the INEX XML retrieval evaluation initiative. Theoretically, we create mathematical models of users' knowledge of a set of documents and define query languages that exactly fit these models. One of these languages corresponds to an XML version of fielded search, the other to the INEX query language. Our main findings are as follows. First, while structure is used in varying degrees of complexity, over half of the queries can be expressed in a fielded-search-like format that does not use the hierarchical structure of the documents. Second, structure is used as a search hint, and not a strict requirement, when judged against the underlying information need. Third, the use of structure in queries functions as a precision-enhancing device.



