ACM Home Page
Please provide us with feedback. Feedback
A statistical approach to decision tree modeling
Full text PdfPdf (786 KB)
Source Annual Workshop on Computational Learning Theory archive
Proceedings of the seventh annual conference on Computational learning theory table of contents
New Brunswick, New Jersey, United States
Pages: 13 - 20  
Year of Publication: 1994
ISBN:0-89791-655-7
Author
Michael I. Jordan  Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 2,   Downloads (12 Months): 25,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/180139.175372
What is a DOI?

ABSTRACT

A statistical approach to decision tree modeling is described. In this approach, each decision in the tree is modeled parametrically as is the process by which an output is generated from an input and a sequence of decisions. The resulting model yields a likelihood measure of goodness of fit, allowing ML and MAP estimation techniques to be utilized. An efficient algorithm is presented to estimate the parameters in the tree. The model selection problem is presented and several alternative proposals are considered. A hidden Markov version of the tree is described for data sequences that have temporal dependencies.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Baum, L.E., Petrie, T., Soules, G., & Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41, 164-171.
 
2
Bengio, Y., & Frasconi, P. (in press). Credit assignment through time: Alternatives to backpropagation. Neural Information Processing Systems 6. San Marco, CA: Morgan Kaufmann.
 
3
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth International Group.
 
4
Cacciatore, T & Nowlan, S. (in press). Mixtures of controllers for jump linear and non-linear plants. Neural Information Processing Systems 6. San Mateo, CA: Morgan Kaufmann.
 
5
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39, 1-38.
 
6
Draper, N. R., & Smith, H. (t981). Applied Regression Analysis. New York: John Wiley.
 
7
Duda, R. O., & Hart, P. E. (1973). Pattern Classification and Scene Analysis. New York: John Wiley.
 
8
Ghahramani, Z., & Jordan, M. I. (in press). Supervised learning from incomplete data via the EM approach. Neural Information Processing Systems 6. San Mateo, CA: Morgan Kaufmann.
 
9
 
10
 
11
McCullagh, P. & Nelder, J.A. (1983). Generalized Linear Models. London: Chapman and Hall.
 
12
Meila, M. P., & Jordan, M. I. (1994). Learning the parameters of HMMs with auxiliary input. MIT Computational Cognitive Science Tech. Rep. 9401, Cambridge, MA.
 
13
Murthy, S. K., Kasif, S., & Salzberg, S. (1993). OCI.' A randomized algorithm for building oblique decision trees. Technical Report, Department of Computer Science, Johns Hopkins University.
 
14
Neal, R., & Hinton, G. E. (1993). A new view of the EM algorithm that justifies incremental and other variants. Submitted to Biometrika.
 
15
 
16
 
17
 
18
Xu, L., Jordan, M. I., & Hinton, G. E. (1994). An alternative mixture of experts model. MIT Computational Cognitive Science Tech. Rep. 9402, Cambridge, MA.