|
ABSTRACT
Neighbor search is a fundamental task in machine learning, especially in classification and retrieval. Efficient nearest neighbor search methods have been widely studied, with their emphasis on data structures but most of them did not consider the underlying global geometry of a data set. Recent graph-based semi-supervised learning methods capture the global geometry, but suffer from scalability and parameter tuning problems. In this paper we present a (nearest) neighbor search method where the underlying global geometry is incorporated and the parameter tuning is not required. To this end, we introduce deterministic walks as a deterministic counterpart of Markov random walks, leading us to use the minimax distance as a global dissimilarity measure. Then we develop a message passing algorithm for efficient minimax distance calculation, which scales linearly in both time and space. Empirical study reveals the useful behavior of the method in image retrieval and semi-supervised learning.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
Carroll, J. D. (1995). Minimax length links of a dissimilarity matrix and minimum spanning trees. Psychometrika, 60, 371--374.
|
 |
4
|
|
| |
5
|
Delalleau, O., Bengio, Y., & Roux, N. L. (2005). Efficient non-parametric function induction in semi-supervised learning. Proceedings of International Workshop on Artificial Intelligence and Statistics (pp. 96--103).
|
| |
6
|
|
| |
7
|
|
| |
8
|
Garcke, J., & Griebel, M. (2005). Semi-supervised learning with sparse grids. Proceedings of ICML, Workshop on Learning with Partially Classified Training Data (pp. 19--28).
|
| |
9
|
Gower, J. C., & Ross, G. J. S. (1969). Minimum spanning tree and single-linkage cluster analysis. Applied Statistics, 18, 54--64.
|
| |
10
|
Kschischang, F. R., Frey, B. J., & Loeliger, H. A. (2001). Factor graphs and the sum-product algorithm. IEEE Trans. Information Theory, 47, 498--519.
|
| |
11
|
Roweis, S. T., & Saul, L. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290, 2323--2326.
|
| |
12
|
|
 |
13
|
Daniel A. Spielman , Shang-Hua Teng, Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems, Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, June 13-16, 2004, Chicago, IL, USA
[doi> 10.1145/1007352.1007372]
|
| |
14
|
Szummer, M., & Jaakkola, T. (2002). Partially labeled classification with Markov random walks. Advances in Neural Information Processing Systems (pp. 945--952). MIT Press.
|
| |
15
|
Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290, 2319--2323.
|
 |
16
|
|
| |
17
|
Xing, E. P., Ng, A. Y., Jordan, M. I., & Russel, S. (2003). Distance metric learning, with application to clustering with side-information. Advances in Neural Information Processing Systems. MIT Press.
|
| |
18
|
Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Schöölkopf, B. (2004). Learning with local and global consistency. Advances in Neural Information Processing Systems (pp. 321--328). MIT Press.
|
| |
19
|
Zhou, D., & Schöölkopf, B. (2004). Learning from labeled and unlabeled data using random walks. Pattern Recognition, Proceedings of the 26th DAGM Symposium (pp. 237--244).
|
| |
20
|
Zhu, X., Ghahramani, Z., & Lafferty, J. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. Proceedings of International Conference on Machine Learning (pp. 912--919).
|
 |
21
|
|
|