Improving the effectiveness of information retrieval with local context analysis
Source: ACM Transactions on Information Systems (TOIS)
Volume 18, Issue 1 (January 2000)
Pages: 79-112
Year of Publication: 2000
ISSN:1046-8188
Authors
Jinxi Xu  BBN Technologies, Cambridge, MA
W. Bruce Croft  Univ. of Massachusetts–Amherst, Amherst
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 32,   Downloads (12 Months): 304,   Citation Count: 85
DOI: http://doi.acm.org/10.1145/333135.333138

ABSTRACT

Techniques for automatic query expansion have been extensively studied in information research as a means of addressing the word mismatch between queries and documents. These techniques can be categorized as either global or local. While global techniques rely on analysis of a whole collection to discover word relationships, local techniques emphasize analysis of the top-ranked documents retrieved for a query. While local techniques have been shown to be more effective than global techniques in general, existing local techniques are not robust and can seriously hurt retrieval when few of the retrieved documents are relevant. We propose a new technique, called local context analysis, which selects expansion terms based on cooccurrence with the query terms within the top-ranked documents. Experiments on a number of collections, both English and non-English, show that local context analysis offers more effective and consistent retrieval results.
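
The general idea of choosing expansion terms by their cooccurrence with query terms in the top-ranked documents can be illustrated with a minimal Python sketch. This is not the scoring function from the paper, which the abstract does not give. The function name `expansion_terms`, the parameter `k`, and the score (the product of term frequencies, summed over documents) are assumptions made for illustration only.

```python
from collections import Counter


def expansion_terms(query_terms, top_docs, k=3):
    """Rank candidate expansion terms by cooccurrence with the query.

    top_docs is a list of token lists, standing in for the top-ranked
    documents retrieved for the query.

    Assumed, simplified score: for each candidate term c, sum over
    documents d of tf(c, d) * (total frequency of query terms in d).
    """
    query = set(query_terms)
    scores = Counter()
    for tokens in top_docs:
        tf = Counter(tokens)
        # How strongly the query itself is represented in this document.
        query_mass = sum(tf[w] for w in query)
        if query_mass == 0:
            continue  # no query term here, so nothing cooccurs with it
        for term, count in tf.items():
            if term not in query:
                scores[term] += count * query_mass
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


top_docs = [
    ["car", "engine", "car", "wheel"],
    ["car", "engine", "repair"],
    ["banana", "fruit"],
]
print(expansion_terms(["car"], top_docs, k=2))
```

In this toy input the third document contains no query term, so its words never become candidates. A term that appears often in the top-ranked set but never alongside the query receives no credit, which is the property the abstract attributes to cooccurrence-based selection.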




REVIEW

Reviewer: Karen Sparck Jones

This good, solid paper addresses the word mismatch problem (that is, different words for a single concept) with query expansion, using the local context supplied by top-ranked documents in a presearch to identify good term associations. This s ...
