Active learning using pre-clustering
Source: ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 79
Year of Publication: 2004
ISBN: 1-58113-828-5
Authors
Hieu T. Nguyen  University of Amsterdam, Amsterdam, The Netherlands
Arnold Smeulders  University of Amsterdam, Amsterdam, The Netherlands
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 114,   Citation Count: 16
DOI: http://doi.acm.org/10.1145/1015330.1015349

ABSTRACT

This paper is concerned with two-class active learning. While the common approach to collecting data in active learning is to select samples close to the classification boundary, better performance can be achieved by taking the prior data distribution into account. The main contribution of the paper is a formal framework that incorporates clustering into active learning. The algorithm first constructs a classifier on the set of cluster representatives and then propagates the classification decision to the other samples via a local noise model. The proposed model makes it possible to select the most representative samples and to avoid repeatedly labeling samples in the same cluster. During the active learning process, the clustering is adjusted using a coarse-to-fine strategy in order to balance the advantage of large clusters against the accuracy of the data representation. Experiments on image databases show that our algorithm performs better than current methods.
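
The loop outlined in the abstract (cluster, classify the representatives, propagate to the remaining samples, query) can be illustrated with a rough sketch. This is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: plain k-means with the nearest sample as representative, a gradient-descent logistic regression, a size-weighted Gaussian kernel standing in for the local noise model, and a "cluster size times uncertainty" query score. The coarse-to-fine adjustment of the clustering is omitted. All function names are hypothetical.

```python
import numpy as np

def cluster_representatives(X, k, n_iter=20, seed=0):
    """Plain k-means (assumed clustering). Returns (reps, assign): the index of
    the sample nearest each centroid, and each sample's cluster id."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=0), d2.argmin(axis=1)

def fit_logistic(X, y, lr=0.5, n_iter=300, l2=1e-2):
    """L2-regularised logistic regression by gradient descent; y in {0, 1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * (Xb.T @ (p - y) / len(y) + l2 * w)
    return w

def predict_proba(w, X):
    """P(y=1 | x) under the fitted logistic model (last weight is the bias)."""
    return 1.0 / (1.0 + np.exp(-(X @ w[:-1] + w[-1])))

def propagate(X, rep_points, sizes, p_rep, sigma=1.0):
    """P(y=1|x) = sum_k P(y=1|rep_k) P(k|x), with P(k|x) from a size-weighted
    Gaussian kernel (an assumed stand-in for the local noise model)."""
    d2 = ((X[:, None, :] - rep_points[None, :, :]) ** 2).sum(axis=-1)
    wts = sizes[None, :] * np.exp(-d2 / (2.0 * sigma ** 2))
    wts /= wts.sum(axis=1, keepdims=True) + 1e-12
    return wts @ p_rep

def select_query(p_rep, sizes, labeled):
    """Pick the unlabeled representative with the largest size * uncertainty.
    `labeled` is a boolean mask over representatives."""
    score = sizes * (1.0 - np.abs(2.0 * p_rep - 1.0))
    score[labeled] = -np.inf  # never query the same cluster twice
    return int(score.argmax())

def active_learning_demo(X, oracle, k=8, rounds=5):
    """Run a few query rounds; `oracle(i)` returns the 0/1 label of sample i."""
    X = np.asarray(X, dtype=float)
    reps, assign = cluster_representatives(X, k)
    sizes = np.bincount(assign, minlength=k)
    labeled = np.zeros(k, dtype=bool)
    y_rep = np.zeros(k)
    p_rep = np.full(k, 0.5)  # no labels yet: every representative is uncertain
    queried = []
    for _ in range(rounds):
        j = select_query(p_rep, sizes, labeled)
        labeled[j], y_rep[j] = True, oracle(reps[j])
        queried.append(j)
        w = fit_logistic(X[reps[labeled]], y_rep[labeled])
        p_rep = predict_proba(w, X[reps])
    return propagate(X, X[reps], sizes, p_rep), queried

# Usage on synthetic two-class data (illustrative only, not the paper's data).
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(-3, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y_demo = np.repeat([0, 1], 50)
p_demo, queried_demo = active_learning_demo(X_demo, lambda i: y_demo[i])
```

Weighting the uncertainty by cluster size is one simple way to favor representative samples, and masking already-labeled representatives keeps the sketch from querying the same cluster twice. The paper's actual selection criterion may differ.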


