ACM Home Page
Please provide us with feedback. Feedback
Orthogonal locality preserving indexing
Full text PdfPdf (242 KB)
Source Annual ACM Conference on Research and Development in Information Retrieval archive
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval table of contents
Salvador, Brazil
SESSION: Theory 1 table of contents
Pages: 3 - 10  
Year of Publication: 2005
ISBN:1-59593-034-5
Authors
Deng Cai  University of Illinois at Urbana Champaign
Xiaofei He  University of Chicago
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 93,   Citation Count: 11
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1076034.1076039
What is a DOI?

ABSTRACT

We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from Latent Semantic Indexing which is optimal in the sense of global Euclidean structure, LPI is optimal in the sense of local manifold structure. However, LPI is extremely sensitive to the number of dimensions. This makes it difficult to estimate the intrinsic dimensionality, while inaccurately estimated dimensionality would drastically degrade its performance. One reason leading to this problem is that LPI is non-orthogonal. Non-orthogonality distorts the metric structure of the document space. In this paper, we propose a new algorithm called Orthogonal LPI. Orthogonal LPI iteratively computes the mutually orthogonal basis functions which respect the local geometrical structure. Moreover, our empirical study shows that OLPI can have more locality preserving power than LPI. We compare the new algorithm to LSI and LPI. Extensive experimental results show that Orthogonal LPI obtains better performance than both LSI and LPI. More crucially, it is insensitive to the number of dimensions, which makes it an efficient data preprocessing method for text clustering, classification, retrieval, etc.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
3
 
4
M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Advances in Neural Information Processing Systems 14, 2001.
 
5
M. Belkin, P. Niyogi, and V. Sindhwani. On maniold regularization. Technical report tr-2004-05, Computer Science Department, The University of Chicago, 2004.
 
6
F. R. K. Chung. Spectral Graph Theory, volume 92 of Regional Conference Series in Mathematics. 1997.
 
7
S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. harshman. Indexing by latent semantic analysis. Journal of the American Society of Information Science, 41(6):391--407, 1990.
8
 
9
 
10
G. H. Golub and C. F. V. Loan. Matrix computations. Johns Hopkins University Press, 3rd edition, 1996.
11
12
 
13
B. Kegl. Intrinsic dimension estimation using packing numbers. In Advances in Neural Information Processing Systems 15, 2002.
14
 
15
L. Lovasz and M. Plummer. Matching Theory. Akadémiai Kiadó, North Holland, Budapest, 1986.
 
16
P. Niyogi, S. Smale, and S. Weinberger. Finding the homology of submanifolds with high confidence from random samples. Technical report tr-2004-08, Department of Computer Science, University of Chicago, 2004.
17
 
18
S. Roweis and L. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323--2326, 2000.
19
 
20
J. Tenenbaum, V. de Silva, and J. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319--2323, 2000.
 
21
U. von Luxburg, O. Bousquet, and M. Belkin. Limits of spectral clustering. In Advances in Neural Information Processing Systems 17, 2004.
22

CITED BY  11