ABSTRACT
Signature files provide an efficient access method for text in documents, but retrieval is usually limited to finding documents that contain a specified Boolean pattern of words. Effective retrieval requires that documents with similar meanings be found through a process of plausible inference. The simplest way of implementing this retrieval process is to rank documents in order of their probability of relevance. This paper describes techniques for implementing probabilistic ranking strategies with sequential and bit-sliced signature files, and points out the limitations of these implementations with regard to their effectiveness. A detailed comparison is made between signature-based ranking techniques and ranking using term-based document representatives and inverted files. The comparison shows that term-based representations are at least competitive (in terms of efficiency) with signature files and, in some situations, superior.
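To make the idea of ranking with a bit-sliced signature file concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method evaluated in the paper: the signature width `SIG_BITS`, the number of bits per term `BITS_PER_TERM`, the hash-based `term_bits` helper, and the query-term weights are all hypothetical choices. Each document gets a superimposed-code signature, the signatures are stored as bit slices (one integer per signature bit position), and a document's score is the sum of the weights of the query terms whose bits are all set in its signature.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed for illustration)
BITS_PER_TERM = 3   # bit positions hashed per term (assumed for illustration)


def term_bits(term):
    """Return the set of signature bit positions for a term (stable hash)."""
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return {digest[i] % SIG_BITS for i in range(BITS_PER_TERM)}


def build_bit_slices(docs):
    """Build a bit-sliced signature file.

    slices[b] is an integer whose bit d is 1 when document d has
    signature bit b set (superimposed coding over the document's terms).
    """
    slices = [0] * SIG_BITS
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            for b in term_bits(term):
                slices[b] |= 1 << doc_id
    return slices


def rank(query_weights, slices, num_docs):
    """Rank documents by the summed weights of matching query terms.

    A term "matches" a document when all of the term's bits are set in the
    document's signature, so false drops (spurious matches) are possible.
    """
    scores = [0.0] * num_docs
    all_docs = (1 << num_docs) - 1
    for term, weight in query_weights.items():
        match = all_docs
        for b in term_bits(term):
            match &= slices[b]          # only the slices for this term are read
        for d in range(num_docs):
            if (match >> d) & 1:
                scores[d] += weight
    order = sorted(range(num_docs), key=lambda d: (-scores[d], d))
    return order, scores


docs = [
    "signature files for text retrieval",
    "inverted files and term weights",
    "probabilistic ranking of documents",
]
slices = build_bit_slices(docs)
order, scores = rank({"signature": 2.0, "files": 1.0}, slices, len(docs))
print(order, scores)
```

Because a signature only records hashed bit positions, the scores are upper bounds on the true term-match scores: a document can never miss a term it contains, but it may be credited with one it does not contain. This is one way to see why the effectiveness of signature-based ranking can be limited compared with term-based representations.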
CITED BY 19
J. K. Cringean, R. England, G. A. Manson, P. Willett, Parallel text searching in serial files using a processor farm, Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval, p.429-453, September 05-07, 1990, Brussels, Belgium
REVIEW
"Jane B. Grimson : Reviewer"
The dramatic growth of office information systems in recent years has
given a tremendous boost to research in information retrieval. This
paper addresses the important issue of providing efficient and
effective methods for the storage and retrie...