Compression of concordances in full-text retrieval systems
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval
Grenoble, France
Pages: 597-612
Year of Publication: 1988
ISBN: 2-7061-0309-4
Authors
Y. Choueka, Dept. of Math. and Computer Science, Bar-Ilan University, Ramat Gan, Israel
A. S. Fraenkel, Dept. of Appl. Math. and Comp. Sc., Weizmann Institute of Science, Rehovot, Israel
S. T. Klein, Graduate Library School and Comp. Sc. Dept., University of Chicago, IL
ABSTRACT
The concordance of a full-text information retrieval system contains, for every distinct word W of the database, a list L(W) of “coordinates”, each of which describes the exact location of an occurrence of W in the text. The concordance should be compressed, not only to save storage space but also to reduce the number of I/O operations, since the file is usually kept in secondary memory. Several methods are presented that efficiently compress the concordances of large full-text retrieval systems. The methods were tested on the concordance of the Responsa Retrieval Project and yield savings of up to 49% relative to the uncompressed file; this is a relative improvement of about 27% over the currently used prefix-omission compression technique.
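The abstract names prefix omission as the baseline technique without describing it. The following is a minimal sketch of the general idea only, not the authors' implementation: when the coordinates in L(W) are sorted, each one can be stored as the number of leading fields it shares with its predecessor plus the fields that differ. The four-field coordinate layout (document, paragraph, sentence, word) and the names `prefix_omit` and `prefix_restore` are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical coordinate layout: (document, paragraph, sentence, word).
Coordinate = Tuple[int, ...]
Encoded = Tuple[int, Tuple[int, ...]]


def prefix_omit(coords: List[Coordinate]) -> List[Encoded]:
    """Store each sorted coordinate as (shared field count, remaining fields)."""
    encoded: List[Encoded] = []
    previous: Coordinate = ()
    for coord in coords:
        # Count the leading fields that match the previous coordinate.
        shared = 0
        while (shared < len(previous) and shared < len(coord)
               and previous[shared] == coord[shared]):
            shared += 1
        encoded.append((shared, coord[shared:]))
        previous = coord
    return encoded


def prefix_restore(encoded: List[Encoded]) -> List[Coordinate]:
    """Rebuild the full coordinates from the prefix-omitted form."""
    coords: List[Coordinate] = []
    previous: Coordinate = ()
    for shared, rest in encoded:
        coord = previous[:shared] + rest
        coords.append(coord)
        previous = coord
    return coords


# Illustrative list L(W) for one word; the values are made up.
example = [(3, 1, 2, 5), (3, 1, 2, 9), (3, 1, 4, 1), (7, 2, 1, 1)]
encoded_example = prefix_omit(example)
restored_example = prefix_restore(encoded_example)
```

In this sketch the second coordinate shares three fields with the first, so only its word number is stored. The saving grows with the number of occurrences that fall in the same document, paragraph, or sentence.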