ABSTRACT
A wide variety of machine learning problems can be described as minimizing a regularized risk functional, with different algorithms using different notions of risk and different regularizers. Examples include linear Support Vector Machines (SVMs), Logistic Regression, Conditional Random Fields (CRFs), and Lasso, among others. This paper describes the theory and implementation of a highly scalable and modular convex solver that solves all of these estimation problems. It can be parallelized on a cluster of workstations, allows for data locality, and can deal with regularizers such as L1 and L2 penalties. At present, our solver implements 20 different estimation problems, can be easily extended, scales to millions of observations, and is up to 10 times faster than specialized solvers for many applications. The open source code is freely available as part of the ELEFANT toolbox.
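To make the idea of a modular regularized risk concrete, here is a minimal sketch in Python. It assumes an objective of the form J(w) = (lam/2)·||w||² + (1/m)·Σ loss(y_i·⟨w, x_i⟩), with the loss as the swappable component. The function names (`hinge`, `logistic`, `regularized_risk`, `minimize`), the toy data, and the step-size rule are all hypothetical choices made for illustration. The optimizer is plain (sub)gradient descent, swapped in for brevity; it is not the solver the paper describes, and it handles only an L2 regularizer.

```python
import math


def hinge(margin):
    """Hinge loss max(0, 1 - margin) and one subgradient w.r.t. the margin."""
    return (1.0 - margin, -1.0) if margin < 1.0 else (0.0, 0.0)


def logistic(margin):
    """Logistic loss log(1 + exp(-margin)) and its derivative, computed stably."""
    if margin > 0:
        e = math.exp(-margin)
        return math.log1p(e), -e / (1.0 + e)
    e = math.exp(margin)
    return -margin + math.log1p(e), -1.0 / (1.0 + e)


def regularized_risk(w, data, loss, lam):
    """Return J(w) and a (sub)gradient for an L2-regularized empirical risk."""
    m = len(data)
    value = 0.5 * lam * sum(wi * wi for wi in w)
    grad = [lam * wi for wi in w]
    for x, y in data:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        loss_value, dloss = loss(margin)
        value += loss_value / m
        for j, xj in enumerate(x):
            grad[j] += dloss * y * xj / m
    return value, grad


def minimize(data, loss, lam=0.1, steps=200):
    """Illustrative (sub)gradient descent with step 1/(lam*t); keeps the best iterate."""
    w = [0.0] * len(data[0][0])
    best_w, best_val = w, regularized_risk(w, data, loss, lam)[0]
    for t in range(1, steps + 1):
        _, g = regularized_risk(w, data, loss, lam)
        eta = 1.0 / (lam * t)
        w = [wi - eta * gi for wi, gi in zip(w, g)]
        val = regularized_risk(w, data, loss, lam)[0]
        if val < best_val:
            best_w, best_val = w, val
    return best_w, best_val


# Hypothetical, linearly separable toy data: (features, label in {-1, +1}).
toy = [([1.0, 2.0], 1), ([2.0, 1.5], 1), ([-1.0, -1.5], -1), ([-2.0, -1.0], -1)]

w_svm, val_svm = minimize(toy, hinge)        # linear SVM-style objective
w_log, val_log = minimize(toy, logistic)     # logistic-regression-style objective
```

Switching from an SVM-style estimator to logistic regression changes only the `loss` argument; the objective and the optimizer loop stay the same. That separation of loss, regularizer, and solver is the kind of modularity the abstract refers to.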