Recent Studies in Automatic Text Analysis and Document Retrieval
Source: Journal of the ACM (JACM)
Volume 20, Issue 2 (April 1973)
Pages: 258-278
Year of Publication: 1973
ISSN: 0004-5411
Author
G. Salton, Cornell University, Department of Computer Science, Ithaca, New York
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/321752.321757

ABSTRACT

Many experts in mechanized text processing now agree that useful automatic language analysis procedures are largely unavailable and that the existing linguistic methodologies generally produce disappointing results. An attempt is made in the present study to identify those automatic procedures which appear most effective as a replacement for the missing language analysis. A series of computer experiments is described, designed to simulate a conventional document retrieval environment. It is found that a simple duplication, by automatic means, of the standard, manual document indexing and retrieval operations will not produce acceptable output results. New mechanized approaches to document handling are proposed, including document ranking methods, automatic dictionary and word list generation, and user feedback searches. It is shown that the fully automatic methodology is superior in effectiveness to the conventional procedures in normal use.

