Locality preserving indexing for document representation
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Sheffield, United Kingdom
SESSION: Dimensionality reduction
Pages: 96 - 103  
Year of Publication: 2004
ISBN:1-58113-881-4
Authors
Xiaofei He, University of Chicago, Chicago, IL
Deng Cai, Tsinghua University, Beijing, China
Haifeng Liu, University of Toronto, Toronto, Canada
Wei-Ying Ma, Microsoft Research Asia, Beijing, China
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 19,   Downloads (12 Months): 101,   Citation Count: 15
DOI: http://doi.acm.org/10.1145/1008992.1009012

ABSTRACT

Document representation and indexing are key problems in document analysis and processing tasks such as clustering, classification, and retrieval. Conventionally, Latent Semantic Indexing (LSI) is considered effective for deriving such an index. However, LSI essentially detects the most representative features for document representation rather than the most discriminative ones, so it may not be optimal for discriminating between documents with different semantics. In this paper, a novel algorithm called Locality Preserving Indexing (LPI) is proposed for document indexing, in which each document is represented by a low-dimensional vector. In contrast to LSI, which discovers the global structure of the document space, LPI discovers the local structure and obtains a compact document representation subspace that best detects the essential semantic structure. We compare the proposed LPI approach with LSI on two standard databases. Experimental results show that LPI provides a better representation in the sense of semantic structure.
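
The contrast between the two approaches can be illustrated with a minimal sketch. The abstract does not give the LPI algorithm itself, so the code below is an assumption: it follows the general Locality Preserving Projections recipe (build a nearest-neighbor graph over documents, then solve a generalized eigenproblem involving the graph Laplacian) and should not be read as the authors' exact procedure. The function names `lsi_embed` and `lpi_embed`, the cosine-similarity graph, the 0/1 edge weights, the `n_neighbors` parameter, and the toy matrix `X_toy` are all hypothetical choices made for illustration.

```python
import numpy as np


def lsi_embed(X, k):
    """LSI-style embedding: project the term-document matrix X (terms x docs)
    onto its top-k left singular vectors (global structure)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k].T @ X  # shape (k, n_docs)


def lpi_embed(X, k, n_neighbors=2):
    """LPP-style embedding (assumed stand-in for LPI): returns (Y, W), where
    Y has shape (k, n_docs) and W is the symmetric neighbor-graph matrix."""
    n = X.shape[1]

    # Assumption: first project onto the numerical range of X via SVD so that
    # the matrix Xr D Xr^T used below is nonsingular.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(s > 1e-10 * s[0]))
    Xr = U[:, :r].T @ X  # shape (r, n_docs)

    # Nearest-neighbor graph from cosine similarity between documents.
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0
    Xn = X / norms
    S = Xn.T @ Xn
    np.fill_diagonal(S, -np.inf)  # a document is not its own neighbor
    nbrs = np.argsort(-S, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize; assumed simple 0/1 weights

    # Graph Laplacian L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # Generalized eigenproblem  Xr L Xr^T a = lambda Xr D Xr^T a,
    # reduced to a standard symmetric one with a Cholesky factor B = C C^T.
    A = Xr @ L @ Xr.T
    B = Xr @ D @ Xr.T
    C = np.linalg.cholesky(B)
    Cinv = np.linalg.inv(C)
    M = Cinv @ A @ Cinv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2)  # eigenvalues in ascending order

    # Keep the k directions with the smallest eigenvalues (local structure).
    P = Cinv.T @ vecs[:, :k]
    return P.T @ Xr, W


# Hypothetical toy corpus: 7 terms x 6 documents, two topics plus one shared term.
X_toy = np.array([
    [3, 1, 0, 0, 0, 0],
    [1, 3, 1, 0, 0, 0],
    [0, 1, 3, 0, 0, 0],
    [0, 0, 0, 3, 1, 0],
    [0, 0, 0, 1, 3, 1],
    [0, 0, 0, 0, 1, 3],
    [1, 1, 1, 1, 1, 1],
], dtype=float)

Y_lsi = lsi_embed(X_toy, k=2)
Y_lpi, W_toy = lpi_embed(X_toy, k=2, n_neighbors=2)
```

In this sketch the LSI projection depends only on the overall variance of the term-document matrix, while the LPP-style projection depends on which documents are neighbors in the graph. That difference is one way to read the abstract's distinction between global and local structure. The SVD step, the neighbor count, and the weighting scheme are tunable assumptions, not values taken from the paper.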

