ABSTRACT
A simple algorithm is presented for increasing the efficiency of information retrieval searches that are implemented using inverted files. This optimization algorithm employs knowledge about the methods used for weighting document and query terms in order to examine as few inverted lists as possible. An extension to the basic algorithm yields much larger efficiency gains at a modest cost in retrieval effectiveness. Experimental runs examine several different term weighting models and show the optimization possible with each.
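The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: using what is known about term weights to stop before every inverted list has been examined. It assumes non-negative weights, an inner-product score, and a caller who wants only the top k documents. The function name `top_k_early_stop`, the toy `index`, and the `query` weights are all hypothetical and are not taken from the paper. The sketch guarantees the same top-k set as a full evaluation, though the order within that set may differ.

```python
import heapq
from collections import defaultdict


def top_k_early_stop(index, query, k):
    """Return (top-k doc ids, number of inverted lists examined).

    index: term -> list of (doc_id, document_term_weight)
    query: term -> query_term_weight
    Assumes all weights are non-negative (an assumption of this sketch).
    """
    # Upper bound on what each query term can add to any one document.
    bounds = {}
    for term, q_weight in query.items():
        postings = index.get(term, [])
        if postings:
            bounds[term] = q_weight * max(w for _, w in postings)

    # Examine the lists with the largest potential contribution first.
    order = sorted(bounds, key=bounds.get, reverse=True)
    remaining = sum(bounds.values())  # most any document can still gain
    scores = defaultdict(float)
    examined = 0

    for term in order:
        best = heapq.nlargest(k + 1, scores.values())
        kth = best[k - 1] if len(best) >= k else 0.0
        runner_up = best[k] if len(best) > k else 0.0  # unseen docs score 0
        # Stop once no document outside the current top k can catch up.
        if len(best) >= k and kth - runner_up > remaining:
            break
        for doc_id, d_weight in index[term]:
            scores[doc_id] += query[term] * d_weight
        remaining -= bounds[term]
        examined += 1

    ranked = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
    return [doc_id for doc_id, _ in ranked], examined


# Hypothetical toy collection: three inverted lists, four documents.
index = {
    "retrieval": [("d1", 5.0), ("d2", 1.0)],
    "inverted": [("d1", 4.0), ("d3", 1.0)],
    "the": [("d1", 0.5), ("d2", 0.5), ("d3", 0.5), ("d4", 0.5)],
}
query = {"retrieval": 2.0, "inverted": 1.0, "the": 0.5}

print(top_k_early_stop(index, query, 1))  # (['d1'], 1)
print(top_k_early_stop(index, query, 2))  # (['d1', 'd2'], 2)
```

In this toy run the long, low-weight list for "the" is never read, which is the kind of saving the abstract refers to. Relaxing the stopping test (for example, stopping when the gap is merely close to `remaining`) would trade some effectiveness for further speed, in the spirit of the extension mentioned above, but the specific rule used in the paper is not stated here.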