Combining document representations for known-item search
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Structured documents
Pages: 143 - 150  
Year of Publication: 2003
ISBN:1-58113-646-3
Authors
Paul Ogilvie  Carnegie Mellon University, Pittsburgh, PA
Jamie Callan  Carnegie Mellon University, Pittsburgh, PA
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/860435.860463

ABSTRACT

This paper investigates the pre-conditions for successful combination of document representations formed from structural markup for the task of known-item search. As this task is very similar to work in meta-search and data fusion, we adapt several hypotheses from those research areas and investigate them in this context. To investigate these hypotheses, we present a mixture-based language model and also examine many of the current meta-search algorithms. We find that compatible output from systems is important for successful combination of document representations. We also demonstrate that combining low-performing document representations can improve performance, but not consistently. We find that the techniques best suited for this task are robust to the inclusion of poorly performing document representations. We also explore the role of variance of results across systems and its impact on the performance of fusion, with the surprising result that the correct documents have higher variance across document representations than highly ranking incorrect documents.
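
The abstract names a mixture-based language model over document representations but does not give its form. The following is a minimal sketch of one common way to build such a model, not the authors' implementation: each representation of a document gets its own maximum-likelihood unigram model, and the query likelihood is a weighted linear mixture of those models plus a collection model for smoothing. The representation names (`title`, `body`), the mixing weights, the toy documents, and the helper names `mle` and `mixture_log_likelihood` are all assumptions made for illustration.

```python
import math
from collections import Counter


def mle(tokens):
    """Maximum-likelihood unigram model: term -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def mixture_log_likelihood(query, doc_reps, collection_model, weights,
                           collection_weight=0.2):
    """Return log P(query | document) under a linear mixture of models.

    doc_reps maps a representation name to its token list.
    weights maps a representation name to its mixing weight; together with
    collection_weight the weights are assumed to sum to 1.
    """
    rep_models = {name: mle(tokens) for name, tokens in doc_reps.items()}
    score = 0.0
    for term in query:
        # Collection model acts as smoothing for terms missing from the document.
        p = collection_weight * collection_model.get(term, 0.0)
        for name, w in weights.items():
            p += w * rep_models.get(name, {}).get(term, 0.0)
        if p == 0.0:
            return float("-inf")  # term unseen everywhere, even in the collection
        score += math.log(p)
    return score


# Hypothetical toy collection with two representations per document.
docs = {
    "d1": {"title": ["lemur", "toolkit"],
           "body": ["language", "modeling", "toolkit", "for", "retrieval"]},
    "d2": {"title": ["trec", "overview"],
           "body": ["web", "track", "overview", "toolkit"]},
}
all_tokens = [t for reps in docs.values() for toks in reps.values() for t in toks]
collection_model = mle(all_tokens)

weights = {"title": 0.5, "body": 0.3}  # assumed values, not taken from the paper
query = ["lemur", "toolkit"]

scores = {doc_id: mixture_log_likelihood(query, reps, collection_model, weights)
          for doc_id, reps in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy setup the document whose title matches the query ranks first, because the title model receives the largest weight. Setting a representation's weight to zero removes it from the mixture, which is one simple way to probe how a poorly performing representation affects the combined ranking.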


