Learning and making decisions when costs and probabilities are both unknown
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
San Francisco, California
Pages: 204 - 213  
Year of Publication: 2001
ISBN:1-58113-391-X
Authors
Bianca Zadrozny  University of California, San Diego, La Jolla, California
Charles Elkan  University of California, San Diego, La Jolla, California
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
AAAI : American Association for Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 81,   Citation Count: 52
DOI: http://doi.acm.org/10.1145/502512.502540

ABSTRACT

In many data mining domains, misclassification costs are different for different examples, in the same way that class membership probabilities are example-dependent. In these domains, both costs and probabilities are unknown for test examples, so both cost estimators and probability estimators must be learned. After discussing how to make optimal decisions given cost and probability estimates, we present decision tree and naive Bayesian learning methods for obtaining well-calibrated probability estimates. We then explain how to obtain unbiased estimators for example-dependent costs, taking into account the difficulty that in general, probabilities and costs are not independent random variables, and the training examples for which costs are known are not representative of all examples. The latter problem is called sample selection bias in econometrics. Our solution to it is based on Nobel prize-winning work due to the economist James Heckman. We show that the methods we propose perform better than MetaCost and all other known methods, in a comprehensive experimental comparison that uses the well-known, large, and challenging dataset from the KDD'98 data mining contest.
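The decision step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a calibrated probability estimate and an example-dependent benefit estimate are already available for each test example, and that acting on an example has a known cost. The names `Example`, `expected_net_benefit`, and `should_act`, and all numbers, are hypothetical illustrations, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """Hypothetical container for the per-example estimates."""
    p_positive: float           # estimated P(positive | x), assumed calibrated
    benefit_if_positive: float  # estimated benefit if the example is positive
    action_cost: float          # cost incurred by acting on this example


def expected_net_benefit(ex: Example) -> float:
    # Expected benefit of acting minus the cost of acting.
    return ex.p_positive * ex.benefit_if_positive - ex.action_cost


def should_act(ex: Example) -> bool:
    # Act only when the expected net benefit is strictly positive.
    return expected_net_benefit(ex) > 0.0


# Illustrative values only (assumptions, not data from the paper).
examples = [
    Example(p_positive=0.05, benefit_if_positive=10.0, action_cost=1.0),
    Example(p_positive=0.02, benefit_if_positive=100.0, action_cost=1.0),
    Example(p_positive=0.50, benefit_if_positive=1.5, action_cost=1.0),
]
decisions = [should_act(ex) for ex in examples]
```

In this sketch the second example is selected despite having the lowest probability, because its estimated benefit is large, while the third is rejected despite having the highest probability. This is why ranking by probability alone is not enough when costs differ from example to example.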


REFERENCES


 
1
 
2
S. D. Bay. UCI KDD archive. Department of Information and Computer Sciences, University of California, Irvine, 2000. http://kdd.ics.uci.edu/.
 
3
P. N. Bennett. Assessing the calibration of naive Bayes' posterior estimates. Technical Report CMU-CS-00-155, School of Computer Science, Carnegie Mellon University, 2000.
 
4
 
5
 
6
L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth International Group, 1984.
 
7
8
 
9
P. Domingos and M. Pazzani. Beyond independence: Conditions for the optimality of the simple Bayesian classifier. In Proceedings of the Thirteenth International Conference on Machine Learning, pages 105-112. Morgan Kaufmann Publishers, Inc., 1996.
 
10
C. Elkan. Cost-sensitive learning and decision-making when costs are unknown. In Workshop Notes, Workshop on Cost-Sensitive Learning at the Seventeenth International Conference on Machine Learning, 2000.
 
11
C. Elkan. The foundations of cost-sensitive learning. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, Aug. 2001.
 
12
 
13
J. Heckman. Sample selection bias as a specification error. Econometrica, 47:153-161, 1979.
 
14
E. C. Malthouse. Assessing the performance of direct marketing scoring models. Journal of Interactive Marketing, 15(1):49-62, 2001.
 
15
 
16
F. Provost and P. Domingos. Well-trained PETs: Improving probability estimation trees. CDER Working Paper 2000-04-1S, Stern School of Business, New York University, NY, NY 10012, 2000.
 
17
 
18
 
19
P. Smyth, A. Gray, and U. Fayyad. Retrofitting decision tree classifiers using kernel density estimation. In Proceedings of the Twelfth International Conference on Machine Learning, pages 506-514. Morgan Kaufmann Publishers, Inc., 1995.
 
20
J. R. Sobehart, R. M. Stein, V. Mikityanskaya, and L. Li. Moody's public firm risk model: A hybrid approach to modeling short term default risk. Technical report, Moody's Investors Service, Global Credit Research, 2000. Available at http://www.moodysqra.com/research/crm/53853.asp.
 
21
K. Turner and J. Ghosh. Theoretical foundations of linear and order statistics combiners for neural pattern classifiers. Technical Report TR-95-02-98, Computer and Vision Research Center, The University of Texas at Austin, 1995.
 
22
P. Turney. Cost-sensitive learning bibliography. Institute for Information Technology, National Research Council, Ottawa, Canada, 2000. http://extractor.iit.nrc.ca/bibliographies/cost-sensitive.html.
 
23
