Compression of concordances in full-text retrieval systems
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Grenoble, France
Pages: 597-612
Year of Publication: 1988
ISBN: 2-7061-0309-4
Authors
Y. Choueka  Dept. of Math. and Computer Science, Bar-Ilan University, Ramat Gan, Israel
A. S. Fraenkel  Dept. of Appl. Math. and Comp. Sc., Weizmann Institute of Science, Rehovot, Israel
S. T. Klein  Graduate Library School and Comp. Sc. Dept., University of Chicago, IL
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/62437.62500

ABSTRACT

The concordance of a full-text information retrieval system contains, for every distinct word W of the database, a list L(W) of “coordinates”, each of which describes the exact location of an occurrence of W in the text. The concordance should be compressed, not only to save storage space but also to reduce the number of I/O operations, since the file is usually kept in secondary memory. Several methods are presented that efficiently compress the concordances of large full-text retrieval systems. The methods were tested on the concordance of the Responsa Retrieval Project and yield savings of up to 49% relative to the uncompressed file; this is a relative improvement of about 27% over the currently used prefix-omission compression technique.
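
To make the terms in the abstract concrete, the following is a minimal sketch of a concordance and of a simple form of prefix omission. It is not the paper's implementation: the three-field coordinate (document, sentence, word position), the nested-list input format, and the function names `build_concordance`, `prefix_omit`, and `prefix_restore` are all assumptions chosen for illustration. The idea shown is only that consecutive coordinates in a sorted list L(W) often share their leading fields, so those fields can be replaced by a count.

```python
from collections import defaultdict


def build_concordance(docs):
    """Map each distinct word W to its list L(W) of coordinates.

    Assumed input: a list of documents, each a list of sentence strings.
    Assumed coordinate: (document index, sentence index, word index).
    """
    conc = defaultdict(list)
    for d, doc in enumerate(docs):
        for s, sentence in enumerate(doc):
            for w, word in enumerate(sentence.split()):
                conc[word.lower()].append((d, s, w))
    return dict(conc)


def prefix_omit(coords):
    """Replace the leading fields shared with the previous coordinate by a count.

    Each output item is (k, rest): k leading fields are copied from the
    previous coordinate, and rest holds the fields that differ.
    """
    encoded = []
    prev = ()
    for c in coords:
        k = 0
        while k < len(prev) and k < len(c) and prev[k] == c[k]:
            k += 1
        encoded.append((k, c[k:]))
        prev = c
    return encoded


def prefix_restore(encoded):
    """Invert prefix_omit and return the original coordinate list."""
    coords = []
    prev = ()
    for k, rest in encoded:
        c = prev[:k] + rest
        coords.append(c)
        prev = c
    return coords


docs = [["the cat sat", "the dog ran"], ["a cat ran"]]
conc = build_concordance(docs)
packed = prefix_omit(conc["the"])
```

Here `conc["the"]` is `[(0, 0, 0), (0, 1, 0)]`, and `packed` stores the second occurrence as `(1, (1, 0))`, because it shares only the document field with the first. A real system would also pack the count and the remaining fields into a compact bit layout, which this sketch leaves out.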


