ABSTRACT
Classification, the development of rules for allocating observations to groups, is a fundamental machine learning task. A classic example is an automated system for a lending institution that decides whether to accept or reject a credit application. One might want a classifier that is allowed to leave unclassified those observations that exhibit attributes of more than one group. This option would allow an expert to inspect "difficult" cases, or would indicate that more data needs to be collected. Classification with an option to reserve judgment on an observation is known as constrained discrimination. We consider a two-stage model for multi-category constrained discrimination in which limits on the misclassification rates of training observations may be pre-specified. The mechanism by which the misclassification limits are satisfied is a rejection option, also known as a reserved judgment group, for observations that do not demonstrate properties of membership in any of the groups.
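To make the idea of a reserved judgment group concrete, the following is a minimal sketch of the simplest kind of rejection option: a threshold on estimated group membership probabilities. It is not the two-stage model described above. The function names, the `RESERVED` label, and the threshold value are all assumptions introduced for illustration.

```python
from typing import List, Sequence

RESERVED = -1  # hypothetical label for the reserved judgment group


def classify_with_reject(posteriors: Sequence[float], threshold: float) -> int:
    """Return the index of the most probable group, or RESERVED when
    that group's estimated probability falls below the threshold."""
    best = max(range(len(posteriors)), key=lambda g: posteriors[g])
    return best if posteriors[best] >= threshold else RESERVED


def misclassification_rate(preds: List[int], labels: List[int], group: int) -> float:
    """Fraction of observations from `group` assigned to a different group.
    Reserved observations are not counted as misclassified."""
    members = [p for p, y in zip(preds, labels) if y == group]
    if not members:
        return 0.0
    wrong = sum(1 for p in members if p != group and p != RESERVED)
    return wrong / len(members)


# Toy data: estimated probabilities for two groups, with true labels.
probs = [[0.95, 0.05], [0.55, 0.45], [0.30, 0.70], [0.48, 0.52]]
labels = [0, 1, 1, 0]

preds = [classify_with_reject(p, threshold=0.6) for p in probs]
rate_group_0 = misclassification_rate(preds, labels, group=0)
rate_group_1 = misclassification_rate(preds, labels, group=1)
```

With a threshold of 0.6, the two ambiguous observations are placed in the reserved judgment group rather than being misclassified, so both per-group misclassification rates on this toy data are zero. Raising the threshold trades a lower misclassification rate for more reserved observations, which is the tension that pre-specified misclassification limits are meant to control.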