ACM Home Page
Please provide us with feedback. Feedback
Digital Library logoTake a look at the new version of this page: [ beta version ]. Tell us what you think.
Access methods for text
Full text PdfPdf (2.59 MB)
Source ACM Computing Surveys (CSUR) archive
Volume 17 ,  Issue 1  (March 1985) table of contents
Annals of discrete mathematics, 24
Pages: 49 - 74  
Year of Publication: 1985
ISSN:0360-0300
Author
Chris Faloutsos  Univ. of Toronto, Toronto, Ont., Canada
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 17,   Downloads (12 Months): 105,   Citation Count: 86
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/4078.4080
What is a DOI?

ABSTRACT

This paper compares text retrieval methods intended for office systems. The operational requirements of the office environment are discussed, and retrieval methods from database systems and from information retrieval systems are examined. We classify these methods and examine the most interesting representatives of each class. Attempts to speed up retrieval with special purpose hardware are also presented, and issues such as approximate string matching and compression are discussed. A qualitative comparison of the examined methods is presented. The signature file method is discussed in more detail.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
3
 
4
ANGELL, R. C., FREUND, G. E., AND WILLET, P. 1983. Automatic spelling correction using a trigram similarity measure. Inf. Process. Manage. 19, 4, pp. 255-261.
5
 
6
BATCHER, K. E. 1968. Sorting networks and their applications. In Proceedings of the AFIPS Spring Joint Computer Conference. AFIPS Press, Reston, Va., pp. 307-314.
 
7
BAYER, R., AND MCCREIGHT, E. 1972. Organization and maintenance of large ordered indexes. Acta Inf. 1, 3, pp. 173-189.
8
9
 
10
BOURNE, C. P. 1963. Methods of Information Handling. Wiley, New York.
 
11
BOURNE, C. P. 1977. Frequency and impact of spelling errors in bibliographic databases. Inf. Process. Manage. 13, 1, pp. 1-12.
12
 
13
BRAUEN, T. 1971. Document vector modification. In The SMART Retrieval System--Experiments in Automatic Document Processing, G. Salton, Ed. Prentice-Hall, Englewood Cliffs, N.J., Ch. 24.
 
14
BURKOWSKI, F. J. 1982. A hardware hashing scheme in the design of a multiterm string comparator. IEEE Trans. Comput. C-31, 9 (Sept.), 825-834.
 
15
CHRISTODOULAKIS, S. 1983. Access files for hatching queries in large information systems. In Proceedings of ICOD II (Aug.).
 
16
 
17
CHRISTODOULAKIS, S., AND FALOUTSOS, C. 1984. Design considerations for a message file server. IEEE Trans. Softw. Eng. SE-IO, 2 (Mar.), 201- 210.
18
 
19
COOPER, W. S. 1970. On deriving design equations for information retrieval systems. J. Am. Soc. Inf. Sci. (Nov.-Dec.).
 
20
CROFT, W. B. 1980. A model of cluster searching based on classification. Inf. Syst. 5, 189-195.
21
 
22
DATrOLA, R. 1979. FIRST: Flexible information retrieval system for text. J. Am. Soc. Inf. Sci, 30, (Jan.), 9-14.
 
23
DE LA BRIANDAIS, S. R. 1959. File searching using variable length keys. In Proceedings of the AFIPS Spring Joint Computer Conference. AFIPS Press, Reston, Va., pp. 295-298.
 
24
DEWEY, C. 1950. Relative Frequency of English Speech Sounds. Harvard University Press, Cambridge, Mass.
 
25
DUDA, R. O., AND HART, P. E. 1973. Pattern Classification and Scene Analysis. Wiley, New York.
26
 
27
FALOUTSOS, C. 1985a. Design of a signature file method that accounts for nonuniform occurrence and query frequencies. In Proceedings of the 11th Conference on Very Large Data Bases (Stockholm, Sweden, Aug.), to appear.
28
29
 
30
FILES, J. R., A~D HUSKEY, H. D. 1969. An information retrieval system based on superimposed coding. In Proceedings of the Fall Joint Computer Conference, vol. 35. AFIPS Press, Reston, Va., pp. 423- 432.
 
31
Fox, E. A. 1984. Extended information retrieval with data and text. PODS, submitted for publication.
32
 
33
FRIEDMAN, S. R., MACEYAK, J. A., AND WEISS, S. F. 1971. A relevance feedback system based on document transformation. In The SMART Retrieval System--Experiments in Automatic Document Processing, G. Salton, Ed., Prentice-Hall Englewood Cliffs, N.J., Ch. 23.
34
 
35
GALLAGER, R. G., AND VAN VOORH{$, D. C. 1975. Optimal source codes for geometrically distributed integer alphabets. IEEE Trans. Inf. Theor. IT~21 (March), 228-230.
 
36
GOLOMB, S. W. 1966. Run length encodings. IEEE Trans. Inf. Theor. IT-12 (July), 399-401.
 
37
GRAVINA, C. M. 1978. National Westminster Bank mass storage archiving. IBM Syst. J. 17, 4, 344- 358.
38
39
40
 
41
HASKIN, R. L. 1981. Special-purpose processors for text retrieval. Database Eng. 4, i (Sept.), 16-29.
42
43
44
 
45
HOLLAAR, L. A. 1979. Text retrieval computers. IEEE Comput. Mag. 12, 3 (Mar.), 40-50.
 
46
HOLLAAR, L. A., SMITH, K. F., CHOW, W. H., EMRATH, P. A., AND HASKIN, R. L. 1983. Architecture and operation of a large, full-text information-retrieval system. In Advanced Database Machine Architecture, D. K. Hsiao, Ed. Prentice- Hall, Englewood Cliffs, N.J., pp. 256-299.
 
47
 
48
IBM 1979. STAIRS/VS: Reference Manual. IBM Systern Manual.
 
49
 
50
KAUTZ, W. H., AND SINGLETON, R. C. 1964. Nonrandom binary superimposed codes. IEEE Trans. Inf. Theor. IT-IO (Oct.), 363-377.
 
51
KNOTT, G. D. 1971. Expandable open addressing hash table storage and retrieval. In Proceedings of the ACM SIGFIDET Conference (San Diego, Calif.). ACM, New York, pp. 187-206.
 
52
KNOTT, G. D. 1975. Hashing functions. Comput. J. 18, 3, pp. 265-278.
 
53
 
54
KNUTH, D. E., MORRIS, J. H., AND PRATT, V. R. 1977. Fast pattern matching in strings. SlAM J. Cornput. 6, 2 (June), 323-350.
 
55
LARSON, P. 1978. Dynamic hashing. BIT 18, pp. 184- 201.
56
 
57
LESK, M. E. 1978. Some applications of inverted indexes on the UNIX system. In the UNIX Programmer's Manual. Bell Laboratories, Murray Hill, N.J.
 
58
LLOYD, J. W. 1980. Optimal partial-match retrieval. BIT 20, pp. 406-413.
 
59
LLOYD, J. W., AND RAMAMOHANARAO, K. 1982. Partial-match retrieval for dynamic files. BIT, 22, pp. 150-168.
60
 
61
 
62
MCILROY, M. D. 1982. Development of a spelling list. IEEETrans. Commun. COM-30, 1 (Jan.), 91-99.
 
63
McLEoo, I. A. 1981. A database management system for document retrieval applications. Inf. Syst. 6, 2, pp. 131-137.
 
64
MOOERS, C. 1949. Application of random codes to the gathering of statistical information. Bull. 31, Zator Co., Cambridge, Mass., 1949. Based on M.S.C. thesis, MIT, Jan. 1948.
65
 
66
OROSZ, G., AND TACKACS, L. 1956. Some probability problems concerning the marking of codes into the superimposed field. J. Doc. 12, 4 (Dec.), 231- 234.
67
68
 
69
 
70
RIVEST, R. L. ~976. Partial match retrieval algorithms. SIAM J. Comput. 5, 1 (Mar.), 19-50.
 
71
ROBERTS, C. S. 1979. Partial-match retrieval via the method of superimposed codes. Proc. iEEE 67, 12 (Dec.), 1624-1642.
72
 
73
Roccmo, J. J. 1971. Relevance feedback in information retrieval. In The SMART Retrieval System-- Experiments in Automatic Document Processing, G. Salton, Ed., Prentice-Hall, Englewood Cliffs, N.J., Ch. 14.
74
 
75
 
76
SALTON, G. 1971b. Relevance feedback and the optimization of retrieval effectiveness. In The SMART Retrieval System--Experiments in Automatic Document Processing, G. Salton, Ed., Prentice-Hall, Englewood Cliffs, N.J., Ch. 15.
 
77
SALTON, G. 1972. Experiments in automatic thesaurus construction for information retrieval. In Informarion Processing 71. North-Holland, Amsterdam, pp. 115-123.
78
 
79
 
80
SALTO~, G. 1980. Automatic information retrieval. IEEE Comput. Mag. 13, 9 (Sept.), 41-56.
 
81
82
83
 
84
SCHUEGRAPH, E. J., ANO HEAPS, H. 8. 1976. Query processing in a retrospective document retrieval system that uses word fragments as language elements. Inf. Process. Manage. 12, pp. 283- 292.
85
 
86
SEVERANCE, D. G. 1983. A practitioner's Guide to database compression. In{. Syst. 8, 1, 51-62.
 
87
8PARCK-JONES, K. 1972. A statistical interpretation of term specificity and its application in retrieval. J. Doc. 28, 1 (Mar.), 11-20.
 
88
STELLHORN, W. H. 1977. An inverted file processor for information retrieval. IEEE Trans. Comput. C-26, 12 (Dec.), 1258-1267.
 
89
STIASSNY, 8. 1960. Mathematical analysis of various superimposed coding methods. Am. Doc. 11, 2 (Feb.), 155-169.
90
91
 
92
 
93
VAN RIJSBERGEN, C. J. 1971. An algorithm for information structuring and retrieval. Comput. J. 14, 4, pp. 407-412.
 
94
 
95
WELCH, T. A. 1984. A technique for high-performance data compression. IEEE Comput. Mag. 17, 6 (June), 8-19.
 
96
WINe, J. M. 1979. Partial-match retrieval using TRIES, hashing and superimposed codes. M.Sc. thesis, MIT, June.
97
98
 
99
ZAHN, C. T. 1971. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans. Comput. C-20, i (Jan.), 68-86.
 
100
ZIV, J., AND LEMPEL, A. 1977. A universal algorithm for sequential data compression. IEEE Trans. In{. Theor. IT-23, 3 (May), 337-343.

CITED BY  86


REVIEW

"Harold Borko : Reviewer"

Automated text retrieval methods are used in libraries and in many online bibliographic and numerical data files. They provide the library user with fast and accurate access to books and journal papers. Text retrieval is also important in office  more...