ACM Home Page
Please provide us with feedback. Feedback
Interruptible anytime algorithms for iterative improvement of decision trees
Full text PdfPdf (224 KB)
Source International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 1st international workshop on Utility-based data mining table of contents
Chicago, Illinois
Pages: 78 - 85  
Year of Publication: 2005
ISBN:1-59593-208-9
Authors
Saher Esmeir  Technion--IIT, Haifa, Israel
Shaul Markovitch  Technion---IIT, Haifa, Israel
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 57,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1089827.1089837
What is a DOI?

ABSTRACT

Finding a minimal decision tree consistent with the examples is an NP-complete problem. Therefore, most of the existing algorithms for decision tree induction use a greedy approach based on local heuristics. These algorithms usually require a fixed small amount of time and result in trees that are not globally optimal. Recently, the LSID3 contract anytime algorithm was introduced to allow using extra resources for building better decision trees. A contract anytime algorithm needs to get its resource allocation a priori. In many cases, however, the time allocation is not known in advance, disallowing the use of contract algorithms. To overcome this problem, in this work we present two interruptible anytime algorithms for inducing decision trees. Interruptible anytime algorithms do not require their resource allocation in advance and thus must be ready to be interrupted and return a valid solution at any moment. The first interruptible algorithm we propose is based on a general technique for converting a contract algorithm to an interruptible one by sequencing. The second is an iterative improvement algorithm that repeatedly selects a subtree whose reconstruction is estimated to yield the highest marginal utility and rebuilds it with higher resource allocation. Empirical evaluation shows a good anytime behavior for both algorithms. The iterative improvement algorithm shows smoother performance profiles which allow more refined control.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
C. L. Blake and C. J. Merz. UCI repository of machine learning databases, 1998.
 
2
 
3
 
4
R. R. Bouckaert. Choosing between two learning algorithms based on calibrated tests. In ICML'03, pages 51--58, Washington, DC, USA, 2003.
 
5
 
6
L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth and Brooks, Monterey, CA, 1984.
 
7
 
8
9
 
10
 
11
12
 
13
T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning: data mining, inference, and prediction. New York: Springer-Verlag, 2001.
 
14
 
15
 
16
L. Hyafil and R. L. Rivest. Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1):15--17, 1976.
 
17
 
18
D. J. Lizotte, O. Madani, and R. Greiner. Budgeted learning of naive bayes classifiers. In UAI'03, Acapulco, Mexico, 2003.
 
19
 
20
 
21
 
22
 
23
 
24
 
25
S. J. Russell and S. Zilberstein. Composing real-time systems. In IJCAI'91, pages 212--217, Sydney, Australia, 1991.
 
26
 
27
 
28
 
29
 
30
 
31
 
32
X. Zhu, S. Sun, S. E. Cheng, and M. Bern. Classification of protein crystallization imagery. In EMBS'04, San Francisco, CA, 2004.


Collaborative Colleagues:
Saher Esmeir: colleagues
Shaul Markovitch: colleagues