ACM Home Page
Please provide us with feedback. Feedback
Incremental updates of inverted lists for text document retrieval
Full text PdfPdf (1.39 MB)
Source International Conference on Management of Data archive
Proceedings of the 1994 ACM SIGMOD international conference on Management of data table of contents
Minneapolis, Minnesota, United States
Pages: 289 - 300  
Year of Publication: 1994
ISBN:0-89791-639-5
Also published in ...
Authors
Anthony Tomasic  Stanford University, Department of Computer Science, Stanford, CA
Héctor García-Molina  Stanford University, Department of Computer Science, Stanford, CA
Kurt Shoens  IBM Almaden Research Center, 650 Harry Road San Jose, CA
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 71,   Citation Count: 45
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/191839.191896
What is a DOI?

ABSTRACT

With the proliferation of the world's “information highways” a renewed interest in efficient document indexing techniques has come about. In this paper, the problem of incremental updates of inverted lists is addressed using a new dual-structure index. The index dynamically separates long and short inverted lists and optimizes retrieval, update, and storage of each type of list. To study the behavior of the index, a space of engineering trade-offs which range from optimizing update time to optimizing query performance is described. We quantitatively explore this space by using actual data and hardware in combination with a simulation of an information retrieval system. We then describe the best algorithm for a variety of criteria.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
Samuel DeFazio. Full-text document retrieval benchmark. In Jim Gray, editor, The Benchmark Handbook }or Database and Transaction Processsng Systems, cha.pter 8. Morgan Kaufmann, second edition, 1993.
 
3
 
4
 
5
 
6
Donna Harman and Gerald Candela. Retrieving records from a gigabyte of text on a minicomputer using statistical ranking. Journal of the American Society }or Information Science, 41(8):581-589, 1990.
 
7
Donald E. Knuth. The Art of Computer Programmzng. Addison-Wesley, Reading, Massachusetts. 1973.
 
8
 
9
 
10
 
11
 
12
George Kingsley Zipf. Human Behavior and the Prznciple of Least Effort. Addison-Wesley Press, 1949.
 
13

CITED BY  45

Collaborative Colleagues:
Anthony Tomasic: colleagues
Héctor García-Molina: colleagues
Kurt Shoens: colleagues