ACM Home Page
Please provide us with feedback. Feedback
Very sparse stable random projections for dimension reduction in lα (0 <α ≤ 2) norm
Full text PdfPdf (843 KB)
Source
International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
San Jose, California, USA
SESSION: Research track papers table of contents
Pages: 440 - 449  
Year of Publication: 2007
ISBN:978-1-59593-609-7
Author
Ping Li  Stanford University
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 13,   Downloads (12 Months): 116,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1281192.1281241
What is a DOI?

ABSTRACT

The method of stable random projections is a useful tool for efficiently computing the lα (0 < α ≤ 2) norms and distances in massive data in one pass. Consider a data matrix A ∈RnxD. If we multiply A with a projection matrix R ΕR Dxk (k« D),whose entries are i.i.d. samples of an α-stable distribution,then the projected matrix B = Ax R Ε R nxkx containsenough information to approximately recover the l α properties in A.

We propose very sparse stable random projections, by replacing the α stable distribution with a (much simpler) mixture of a symmetric α Pareto distribution (with probability Β, 0 β Β 1) and a point mass at the origin(with probability 1-Β). This leads to a significant 1 over Β fold speedup for small Β when computing B = AxR and a 1 over Β-fold cost reduction in storing R}. By analyzing the convergence, we show that in"reasonable" datasets Β often can be very small (e.g.,D1/2 without hurting the estimation accuracy. Some numerical evaluations are conducted, on synthetic data, Web crawldata, and gene expression microarray data.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
D. Achlioptas, F. McSherry, and B. Schölkopf. Sampling techniques for kernel methods. In NIPS, 335--342, 2001.
 
4
5
6
 
7
 
8
9
 
10
M. Balcan, A. Blum, and S. Vempala. On kernels, margins, and low-dimensional mappings. In ALT, 194--205, 2004.
 
11
A. Bhattacharjee, W. G. Richards, J. Staunton, C. Li, S. Monti, P. Vasa, C. Ladd, J. Beheshti, R. Bueno, M. Gillette, M. Loda, G. Weber, E. J. Mark, E. S. Lander, W. Wong, B. E. Johnson, T. R. Golub, D. J. Sugarbaker, and M. Meyerson. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. PNAS, 98(24):13790--13795, 2001.
12
 
13
14
 
15
 
16
J. Buhler and M. Tompa. Finding motifs using random projections. Journal of Computational Biology, 9(2):225--242, 2002.
 
17
O. Chapelle, P. Haffner, and V. N. Vapnik. Support vector machines for histogram-based image classification. IEEE Trans. Neural Networks, 10(5):1055--1064, 1999.
18
 
19
 
20
 
21
G. Cormode and S. Muthukrishnan. Estimating dominance norms of multiple data streams. In ESA, 148--160, 2003.
 
22
 
23
 
24
D. L. Donoho. Unconditional bases are optimal bases for data compression and for statistical estimation. Applied and Computational Harmonic Analysis, 1(1):100--115, 1993.
 
25
D. L. Donoho. Compressed sensing. IEEE Trans. Inform. Theory, 52(4):1289--1306, 2006.
 
26
S. T. Dumais. Improving the retrieval of information from external sources. Behavior Research Methods, Instruments and Computers, 23(2):229--236, 1991.
 
27
R. Durrett. Probability: Theory and Examples. Duxbury Press, second edition, 1995.
 
28
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. The Annals of Statistics, 32(2):407--499, 2004.
 
29
M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the Internet topology. In SIGMOD, 251--262, 1999.
 
30
E. F. Fama and R. Roll. Some properties of symmetric stable distributions. Journal of the American Statistical Association, 63(323):817--836, 1968.
 
31
E. F. Fama and R. Roll. Parameter estimates for symmetric stable distributions. Journal of the American Statistical Association, 66(334):331--338, 1971.
 
32
33
 
34
X. Z. Fern and C. E. Brodley. Random projection for high dimensional data clustering: A cluster ensemble approach. In ICML, 186--193, 2003.
 
35
J. H. Fong and M. Strauss. An approximate lp difference algorithm for massive data streams. Discrete Mathematics & Theoretical Computer Science, 4(2):301--322, 2001.
36
 
37
 
38
B. V. Gnedenko and A. N. Kolmogorov. Limit Distributions for Sum of Independent Random Variables. Addison Wesley, 1954.
 
39
N. Goel, G. Bebis, and A. Nefian. Face recognition experiments with random projection. In SPIE, 42---437, 2005.
 
40
I. S. Gradshteyn and I. M. Ryzhik. Table of Integrals, Series, and Products. Academic Press, fifth edition, 1994.
41
 
42
M. R. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. American Mathematical Society, 1999.
 
43
 
44
45
46
 
47
W. B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mapping into Hilbert space. Contemporary Mathematics, 26:189--206, 1984.
 
48
W. B. Johnson and G. Schechtman. Embedding lp into l1. Acta. Math., 149:71--85, 1982.
 
49
R. Kuske and J. B. Keller. Rate of convergence to a stable law. SIAM Journal of Applied Mathematics, 61(4):1308--1323, 2000.
50
 
51
J. R. Lee and A. Naor. Embedding the diamond graph in lp and dimension reduction in l1. Geometric And Functional Analysis, 14(4):745--747, 2004.
 
52
 
53
 
54
H. C. M. Leung, F. Y. L. Chin, S. M. Yiu, R. Rosenfeld, and W. W. Tsang. Finding motifs with insufficient number of strong binding sites. Journal of Computational Biology, 12(6):686--701, 2005.
 
55
P. Li. Very sparse stable random projections, estimators and tail bounds for stable random projections. http://arxiv.org/PS_cache/cs/pdf/0611/0611114v2.pdf, 2006.
 
56
P. Li. Estimators and tail bounds for dimension reduction in lα (0 < α ≤ 2) using stable random projections. Technical report, Department of Statistics, Stanford University, http://www.stanford.edu/~pingli98/publications/stable.pdf, 2007.
 
57
 
58
 
59
P. Li, K. W. Church, and T.J. Hastie. Conditional random sampling: A sketch-based sampling technique for sparse data. In NIPS, 2007.
 
60
P. Li, T. J. Hastie, and K. W. Church. Improving random projections using marginal information. In COLT, 635--649, 2006.
61
 
62
P. Li, T. J. Hastie, and K. W. Church. Nonlinear estimators and tail bounds for dimensional reduction in l1 using Cauchy random projections. In COLT, 2007.
 
63
 
64
J. Lin and D. Gunopulos. Dimensionality reduction by random projection and latent semantic indexing. In SDM, 2003.
65
 
66
 
67
 
68
J. Huston McCulloch. Simple consistent estimators of stable distribution parameters. Communications on Statistics-Simulation, 15(4):1109--1136, 1986.
 
69
M. E. J. Newman. Power laws, Pareto distributions and Zipf's law. Contemporary Physics, 46(5):232--351, 2005.
70
 
71
 
72
J. D. Rennie, L. Shih, J. Teevan, and D. R. Karger. Tackling the poor assumptions of naive Bayes text classifiers. In ICML, 616--623, 2003.
 
73
 
74
G. Samorodnitsky and M. S. Taqqu. Stable Non-Gaussian Random Processes. Chapman & Hall, 1994.
 
75
R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of Royal Statistical Society B, 58(1):267--288, 1996.
 
76
 
77
S. Vempala. The Random Projection Method. American Mathematical Society, 2004.
78
 
79
J. Zhu, S. Rosset, T. J. Hastie, and R, Tibshirani. 1-norm support vector machines. In NIPS, 2003.
 
80
V. M. Zolotarev. One-dimensional Stable Distributions. American Mathematical Society, 1986.