ABSTRACT
Latent Semantic Indexing (LSI) is a well-established and effective framework for conceptual information retrieval. In traditional implementations of LSI, the semantic structure of the collection is projected into the k-dimensional space derived from a rank-k approximation of the original term-by-document matrix. This paper discusses a new way to implement the LSI methodology, based on polynomial filtering. The new framework does not rely on any matrix decomposition, so its computational cost and storage requirements are low relative to traditional implementations of LSI. Additionally, it can be used as an effective information filtering technique when updating LSI models based on user feedback.
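To make the contrast concrete, the following is a minimal sketch, not the paper's implementation. It relies on a standard linear-algebra identity: the rank-k LSI similarity vector A_k^T q equals φ(A^T A) A^T q, where φ is a step function that is 1 on the k largest eigenvalues of A^T A and 0 on the rest. Replacing φ with a polynomial ψ gives scores that need only matrix-vector products with A and A^T. The function names and the monomial coefficient list `coeffs` are hypothetical, chosen for illustration; the abstract does not say which polynomial the authors use or how they evaluate it.

```python
import numpy as np


def truncated_svd_scores(A, q, k):
    """Traditional LSI: similarity scores A_k^T q from a rank-k truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # A_k^T q = V_k * Sigma_k * U_k^T q
    return Vt[:k].T @ (s[:k] * (U[:, :k].T @ q))


def polynomial_filter_scores(A, q, coeffs):
    """Decomposition-free variant: psi(A^T A) A^T q.

    psi(t) = coeffs[0] + coeffs[1] * t + ... + coeffs[d] * t**d
    (hypothetical monomial form, assumed here for simplicity).
    Evaluated with Horner's rule, so A^T A is never formed explicitly.
    """
    v = A.T @ q
    result = coeffs[-1] * v
    for c in reversed(coeffs[:-1]):
        # One multiplication by A^T A, done as two matrix-vector products.
        result = A.T @ (A @ result) + c * v
    return result
```

When ψ matches the step function exactly at every eigenvalue of A^T A, the two functions agree. For A = diag(2, 1) and k = 1, the linear polynomial ψ(t) = (t - 1)/3, that is `coeffs = [-1/3, 1/3]`, satisfies ψ(4) = 1 and ψ(1) = 0 and so reproduces the truncated-SVD scores. In practice ψ would only approximate the step function, and the monomial basis used here can be numerically poor at high degree.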