An analysis of latent semantic term self-correlation
Source
ACM Transactions on Information Systems (TOIS)
Volume 27, Issue 2 (February 2009)
Article No. 8
Year of Publication: 2009
ISSN: 1046-8188
Authors
Laurence A. F. Park, The University of Melbourne, Australia
Kotagiri Ramamohanarao, The University of Melbourne, Australia
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1462198.1462200

ABSTRACT

Latent semantic analysis (LSA) is a generalized vector space method that uses dimension reduction to generate term correlations for use during the information retrieval process. We hypothesized that although the dimension reduction establishes correlations between terms, it also degrades the correlation of a term with itself (its self-correlation). In this article, we prove that there is a direct relationship between the size of the LSA dimension reduction and the LSA self-correlation. We also show that altering the LSA term self-correlations yields a substantial increase in precision while also reducing the computation required during the information retrieval process.
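
The effect described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the article: the toy matrix A is hypothetical, the function names are invented for illustration, and term correlations are modeled as V_k V_k^T, where the columns of V_k are the top-k right singular vectors of a document-by-term matrix. The article's own formulation and weighting may differ.

```python
import numpy as np

# Hypothetical toy document-by-term count matrix (5 documents, 4 terms).
# The values are illustrative only and do not come from the article.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [1, 0, 0, 1],
], dtype=float)


def term_correlations(doc_term, k):
    """Term-term correlations after keeping the k largest singular values.

    Assumption: correlations are modeled as V_k V_k^T, where the columns of
    V_k are the top-k right singular vectors of the document-by-term matrix.
    """
    _, _, vt = np.linalg.svd(doc_term, full_matrices=False)
    v_k = vt[:k].T          # shape: (terms, k)
    return v_k @ v_k.T      # shape: (terms, terms)


def term_self_correlations(doc_term, k):
    """Diagonal of the term-term correlation matrix: each term with itself."""
    return np.diag(term_correlations(doc_term, k)).copy()


if __name__ == "__main__":
    # Print the self-correlations for each number of retained dimensions.
    for k in range(1, A.shape[1] + 1):
        print(k, np.round(term_self_correlations(A, k), 3))
```

Under these assumptions, each self-correlation is a sum of squared singular-vector components, so it can only grow as more dimensions are retained. It reaches 1 for every term when no reduction is applied, because A has full column rank. Off-diagonal entries of the same matrix are the correlations between different terms that the reduction introduces.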


