Active learning with direct query construction
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Las Vegas, Nevada, USA
SESSION: Research papers
Pages 480-487
Year of Publication: 2008
ISBN:978-1-60558-193-4
Authors
Charles X. Ling  The University of Western Ontario, London, ON, Canada
Jun Du  The University of Western Ontario, London, ON, Canada
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1401890.1401950

ABSTRACT

Active learning may hold the key to solving the data scarcity problem in supervised learning, i.e., the lack of labeled data. Labeling data is costly, yet an active learner can request labels for only selected instances, thus reducing labeling work dramatically. Most previous work on active learning is, however, pool-based; that is, a pool of unlabeled examples is given, and the learner can only select examples from the pool to query for their labels. This type of active learning has several weaknesses. In this paper we propose novel active learning algorithms that construct examples directly to query for labels. We study both a specific active learner based on the decision tree algorithm and a general active learner that can work with any base learning algorithm. Because there is no restriction on which examples may be queried, our methods are shown to often need fewer queries to reduce the predictive error quickly. This casts doubt on the usefulness of the pool in pool-based active learning. Nevertheless, our methods can be easily adapted to work with a given pool of unlabeled examples.
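
The contrast between selecting queries from a pool and constructing them directly can be illustrated with a toy sketch. It is not the decision-tree learner or the general learner studied in the paper. It assumes a one-dimensional threshold concept on [0, 1], and the names `THRESHOLD`, `oracle`, `update`, `pool_based`, and `direct_construction` are hypothetical, introduced only for illustration.

```python
import random

THRESHOLD = 0.37  # hypothetical target concept: label is 1 when x >= THRESHOLD


def oracle(x):
    """Stand-in for the human labeler (an assumption of this sketch)."""
    return 1 if x >= THRESHOLD else 0


def update(lo, hi, x, label):
    """Shrink the interval (lo, hi] known to contain the decision boundary."""
    if label == 1:
        return lo, min(hi, x)
    return max(lo, x), hi


def pool_based(pool, n_queries):
    """Query the pool example closest to the middle of the uncertain interval."""
    lo, hi = 0.0, 1.0
    remaining = list(pool)
    for _ in range(n_queries):
        candidates = [x for x in remaining if lo < x < hi]
        if not candidates:
            break  # the pool has nothing informative left to ask about
        mid = (lo + hi) / 2
        x = min(candidates, key=lambda c: abs(c - mid))
        remaining.remove(x)
        lo, hi = update(lo, hi, x, oracle(x))
    return lo, hi


def direct_construction(n_queries):
    """Construct each query at the midpoint of the uncertain interval."""
    lo, hi = 0.0, 1.0
    for _ in range(n_queries):
        x = (lo + hi) / 2
        lo, hi = update(lo, hi, x, oracle(x))
    return lo, hi


rng = random.Random(0)  # fixed seed so the toy pool is reproducible
pool = [rng.random() for _ in range(20)]
pool_lo, pool_hi = pool_based(pool, n_queries=8)
con_lo, con_hi = direct_construction(n_queries=8)
print("pool-based interval width: ", pool_hi - pool_lo)
print("constructed interval width:", con_hi - con_lo)
```

In this toy setting the constructed queries halve the uncertain interval at every step, so eight queries leave an interval of width 1/256, whereas the pool-based learner cannot narrow the interval beyond the gap between the two pool examples that bracket the boundary. Real feature spaces and base learners are far less tidy, so this should be read only as intuition for why unrestricted queries can need fewer labels.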

