ABSTRACT
This paper describes a unique example-based mapping method for document retrieval. We found that knowledge of the relevance between queries and documents can be used to obtain empirical connections between query terms and the canonical concepts used to index document content. These connections do not depend on whether queries and documents share terms; they are therefore especially effective for mapping queries to documents whose concepts are relevant but whose authors happen to use terms different from those of database users. We employ a Linear Least Squares Fit (LLSF) technique to compute such connections from a collection of queries and documents whose relevance has been assigned by humans, and then use these connections to retrieve documents whose relevance is unknown. We tested this method on both retrieval and indexing with a set of MEDLINE documents that has been used to evaluate other information retrieval systems. The tests showed the effectiveness of the LLSF mapping and a significant improvement over alternative approaches.
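The idea of fitting term-to-concept connections from human-judged examples and then applying them to unseen queries can be illustrated with a small sketch. It is a minimal illustration under stated assumptions, not the implementation evaluated in the paper: the toy vocabulary, the concept labels, the binary term and concept weights, the cosine ranking step, and the function names `fit_llsf` and `rank_documents` are all hypothetical, and NumPy's general least squares solver stands in for whatever numerical routine the authors used.

```python
import numpy as np

def fit_llsf(A, B):
    """Fit W minimizing ||W A - B||_F (hypothetical helper).

    A: terms x pairs matrix, one query term vector per column.
    B: concepts x pairs matrix, the concept vector of the document
       judged relevant to the query in the same column.
    """
    # lstsq solves A.T @ X = B.T in the least squares sense
    # (minimum-norm solution when underdetermined); W is X transposed.
    X, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)
    return X.T

def rank_documents(W, query_vec, doc_concepts):
    """Map a query into concept space and rank documents by cosine score.

    The cosine ranking is an assumption made for this sketch.
    """
    y = W @ query_vec
    norms = np.linalg.norm(doc_concepts, axis=1) * np.linalg.norm(y)
    scores = doc_concepts @ y / np.where(norms == 0, 1.0, norms)
    return np.argsort(-scores), scores

# Toy data (assumed for illustration only).
terms = ["heart", "attack", "cardiac", "kidney"]
concepts = ["Myocardial Infarction", "Kidney Diseases"]

# Three human-judged training pairs, one per column:
# "heart attack" -> MI, "cardiac" -> MI, "kidney" -> Kidney Diseases.
A = np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]], dtype=float)
B = np.array([[1, 1, 0],
              [0, 0, 1]], dtype=float)

W = fit_llsf(A, B)

# Documents indexed by concepts only; they need not share any query term.
doc_concepts = np.array([[1, 0],   # document 0: Myocardial Infarction
                         [0, 1]],  # document 1: Kidney Diseases
                        dtype=float)

# A new query, "heart", that was never seen alone in training.
query = np.array([1, 0, 0, 0], dtype=float)
order, scores = rank_documents(W, query, doc_concepts)
print("ranking:", order.tolist())
```

In this toy setting the query reaches the document indexed under the cardiac concept through the learned weights alone, which is the behaviour the abstract attributes to connections that do not rely on shared terms.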