ABSTRACT
Forming test collection relevance judgments from the pooled output of multiple retrieval systems has become the standard process for creating resources such as the TREC, CLEF, and NTCIR test collections. This paper presents a series of experiments examining three different ways of building test collections where no system pooling is used. First, a collection formation technique combining manual feedback and multiple systems is adapted to work with a single retrieval system. Second, an existing method based on pooling the output of multiple manual searches is re-examined, testing a wider range of searchers and retrieval systems than has been examined before. Third, a new approach is explored in which the ranked output of a single automatic search on a single retrieval system is assessed for relevance, with no pooling whatsoever. In all three cases, established techniques for evaluating the quality of relevance judgments indicate that the resulting test collections are as good as those of TREC.