An empirical evaluation of supervised learning in high dimensions
Source
Proceedings of the 25th International Conference on Machine Learning (ICML); Vol. 307
Helsinki, Finland
Pages 96-103
Year of Publication: 2008
ISBN: 978-1-60558-205-4
ABSTRACT
In this paper we perform an empirical evaluation of supervised learning on high-dimensional data. We evaluate performance on three metrics (accuracy, AUC, and squared loss) and study the effect of increasing dimensionality on the performance of the learning algorithms. Our findings are consistent with previous studies for problems of relatively low dimension, but suggest that as dimensionality increases the relative performance of the learning algorithms changes. To our surprise, the method that performs consistently well across all dimensions is random forests, followed by neural nets, boosted trees, and SVMs.
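The three metrics named in the abstract can be illustrated with a minimal sketch for binary classification. The abstract does not give the paper's exact definitions, so the choices here are assumptions: a 0.5 decision threshold for accuracy, the pairwise-ranking form of AUC with ties counted as one half, and mean squared error between predicted scores and 0/1 labels for squared loss. All function names are hypothetical.

```python
# Illustrative sketch only; the definitions below are common textbook forms
# and are assumed, not taken from the paper.

def accuracy(y_true, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the 0/1 label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(y_true, scores))
    return correct / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def squared_loss(y_true, scores):
    """Mean squared difference between predicted scores and 0/1 labels."""
    return sum((s - y) ** 2 for y, s in zip(y_true, scores)) / len(y_true)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    preds = [0.9, 0.2, 0.4, 0.6]
    print(accuracy(labels, preds), auc(labels, preds), squared_loss(labels, preds))
```

Accuracy depends on a threshold, AUC only on the ordering of the scores, and squared loss on how well calibrated the scores are, which is one reason the relative ranking of learning algorithms can differ from metric to metric.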
CITED BY
Miloš Radovanović, Alexandros Nanopoulos, Mirjana Ivanović, Nearest neighbors in high-dimensional data: the emergence and influence of hubs, Proceedings of the 26th Annual International Conference on Machine Learning, p. 865-872, June 14-18, 2009, Montreal, Quebec, Canada