Building a filtering test collection for TREC 2002
Source Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Filtering and retrieval models
Pages: 243 - 250  
Year of Publication: 2003
ISBN: 1-58113-646-3
Authors
Ian Soboroff  National Institute of Standards and Technology, Gaithersburg, MD
Stephen Robertson  Microsoft Research, Cambridge, UK
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/860435.860481

ABSTRACT

Test collections for the filtering track in TREC have typically used either past sets of relevance judgments, or categorized collections such as Reuters Corpus Volume 1 or OHSUMED, because filtering systems need relevance judgments during the experiment for training and adaptation. For TREC 2002, we constructed an entirely new set of search topics for the Reuters Corpus for measuring filtering systems. Our method for building the topics involved multiple iterations of feedback from assessors, and fusion of results from multiple search systems using different search algorithms. We also developed a second set of "inexpensive" topics based on categories in the document collection. We found that the initial judgments made for the experiment were sufficient; subsequent pooled judging changed system rankings very little. We also found that systems performed very differently on the category topics than on the assessor-built topics.


