Shrinkage estimator generalizations of Proximal Support Vector Machines
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
SESSION: Statistical methods II
Pages: 173-182
Year of Publication: 2002
ISBN: 1-58113-567-X
Author
Deepak K. Agarwal  AT&T Labs-Research, Florham Park, NJ
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
AAAI
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/775047.775073

ABSTRACT

We give a statistical interpretation of Proximal Support Vector Machines (PSVM), proposed at KDD 2001 as linear approximators to (nonlinear) Support Vector Machines (SVM). We prove that PSVM using a linear kernel is identical to ridge regression, a biased-regression method known in the statistical community for more than thirty years. Techniques from the statistical literature for estimating the tuning constant that appears in the SVM and PSVM framework are discussed. Better shrinkage strategies that incorporate more than one tuning constant are suggested. For nonlinear kernels, the minimization problem posed in the PSVM framework is equivalent to finding the posterior mode of a Bayesian model defined through a Gaussian process on the predictor space. Apart from providing new insights, these interpretations help us attach an estimate of uncertainty to our predictions and enable us to build richer classes of models. In particular, we propose a new algorithm called PSVMMIX, which is a combination of ridge regression and a Gaussian process model. The extension to the case of a continuous response is straightforward and is illustrated with example datasets.
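
The linear-kernel equivalence stated in the abstract can be checked numerically. The sketch below is not taken from the paper. It assumes the usual linear PSVM formulation (minimize nu/2 * ||y||^2 + 1/2 * (w'w + gamma^2) subject to D(Aw - e*gamma) + y = e, with D = diag(d) holding the +1/-1 labels) and compares its closed-form solution with ridge regression of the labels d on the augmented matrix E = [A, -e] using penalty 1/nu. The function names `psvm_linear` and `ridge`, the synthetic data, and the value of `nu` are illustrative assumptions.

```python
import numpy as np

def psvm_linear(A, d, nu):
    """Linear PSVM in its dual-style closed form (assumed formulation).

    Solves u = (I/nu + H H')^{-1} e with H = D [A, -e], then returns
    the stacked vector (w, gamma) = H' u.
    """
    m = A.shape[0]
    E = np.hstack([A, -np.ones((m, 1))])   # augmented design E = [A, -e]
    H = d[:, None] * E                      # H = D E, with D = diag(d)
    u = np.linalg.solve(np.eye(m) / nu + H @ H.T, np.ones(m))
    return H.T @ u

def ridge(E, d, lam):
    """Ridge regression of d on E; every coefficient is penalized."""
    p = E.shape[1]
    return np.linalg.solve(E.T @ E + lam * np.eye(p), E.T @ d)

# Synthetic two-class data (illustrative only).
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 3))
d = np.where(A[:, 0] + 0.5 * rng.normal(size=40) > 0, 1.0, -1.0)

nu = 2.0                                    # PSVM tuning constant (assumed value)
E = np.hstack([A, -np.ones((40, 1))])

beta_psvm = psvm_linear(A, d, nu)           # (w, gamma) from PSVM
beta_ridge = ridge(E, d, 1.0 / nu)          # ridge with penalty 1/nu

# The two coefficient vectors should agree up to floating-point error.
print(np.allclose(beta_psvm, beta_ridge))
```

Under these assumptions the agreement follows from D'D = I together with the identity H'(I/nu + HH')^{-1} = (I/nu + H'H)^{-1}H'. Note that in this formulation the intercept gamma is shrunk along with w, which differs from the common ridge convention of leaving the intercept unpenalized.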

