ACM Home Page
Please provide us with feedback. Feedback
Tensor-CUR decompositions for tensor-based data
Full text PdfPdf (1.81 MB)
Source International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining table of contents
Philadelphia, PA, USA
SESSION: Research track papers table of contents
Pages: 327 - 336  
Year of Publication: 2006
ISBN:1-59593-339-5
Authors
Michael W. Mahoney  Yahoo Research Labs, Sunnyvale, CA
Mauro Maggioni  Yale University, New Haven, CT
Petros Drineas  Rensselaer Polytechnic Institute, Troy, NY
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 16,   Downloads (12 Months): 81,   Citation Count: 4
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1150402.1150440
What is a DOI?

ABSTRACT

Motivated by numerous applications in which the data may be modeled by a variable subscripted by three or more indices, we develop a tensor-based extension of the matrix CUR decomposition. The tensor-CUR decomposition is most relevant as a data analysis tool when the data consist of one mode that is qualitatively different than the others. In this case, the tensor-CUR decomposition approximately expresses the original data tensor in terms of a basis consisting of underlying subtensors that are actual data elements and thus that have natural interpretation in terms ofthe processes generating the data. In order to demonstrate the general applicability of this tensor decomposition, we apply it to problems in two diverse domains of data analysis: hyperspectral medical image analysis and consumer recommendation system analysis. In the hyperspectral data application, the tensor-CUR decomposition is used to compress the data, and we show that classification quality is not substantially reduced even after substantial data compression. In the recommendation system application, the tensor-CUR decomposition is used to reconstruct missing entries in a user-product-product preference tensor, and we show that high quality recommendations can be made on the basis of a small number of basis users and a small number of product-product comparisons from a new user.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
 
3
M. W. Berry, S. A. Pulatova, and G. W. Stewart. Computing sparse reduced-rank approximations to sparse matrices. Technical Report UMIACS TR-2004-32 CMSC TR-4589, University of Maryland, College Park, MD, 2004.
 
4
 
5
J. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the 14th Annual Conference on Uncertainty in Artificial Intelligence, pages 43--52, 1998.
 
6
J. D. Carroll and J. J. Chang. Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart- Young" decomposition. Psychometrika, 35(3):283--319, 1970.
 
7
 
8
 
9
 
10
11
 
12
P. Drineas and M. W. Mahoney. A randomized algorithm for a tensor-based generalization of the Singular Value Decomposition. To appear in: Linear Algebra and its Applications.
 
13
P. Drineas and M. W. Mahoney. On the Nyström method for approximating a Gram matrix for improved kernel-based learning. Journal of Machine Learning Research, 6:2153--2175, 2005.
 
14
 
15
 
16
G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, 1989.
 
17
S. A. Goreinov and E. E. Tyrtyshnikov. The maximum-volume concept in approximation by low-rank matrices. Contemporary Mathematics, 280:47--51, 2001.
 
18
S. A. Goreinov, E. E. Tyrtyshnikov, and N. L. Zamarashkin. A theory of pseudoskeleton approximations. Linear Algebra and Its Applications, 261:1--21, 1997.
 
19
R. A. Harshman and M. E. Lundy. The PARAFAC model for three-way factor analysis and multidimensional scaling. In H. G. Law, C. W. Snyder Jr., J. Hattie, and R. P. McDonald, editors, Research Methods for Multimode Data Analysis, pages 122--215. Praeger, 1984.
 
20
 
21
R. Jin, L. Si, and C. X. Zhai. Preference-based graphic models for collaborative filtering. In Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence, pages 329--336, 2003.
22
23
 
24
P. M. Kroonenberg and J. De Leeuw. Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 45(1):69--97, 1980.
 
25
J. B. Kruskal. Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Algebra and Its Applications, 18:95--138, 1977.
 
26
 
27
 
28
 
29
 
30
D. Leibovici and R. Sabatier. A singular value decomposition of a k-way array for a principal component analysis of multiway data, PTA-k. Linear Algebra and Its Applications, 269:307--329, 1998.
 
31
L.-H. Lim and G. H. Golub. Tensors for numerical multilinear algebra: ranks and basic decompositions. Technical Report 05-02, Stanford University SCCM, Stanford, CA, April 2005.
 
32
 
33
M. Maggioni, G. L. Davis, F. J. Warner, F. B. Geshwind, A. C. Coppi, R. A. DeVerse,and R. R. Coifman. Algorithms from signal and data processing applied to hyperspectral analysis: Discriminating normal and malignant microarray colon tissue sections using a novel digital mirror device system. Manuscript, 2006.
 
34
M. Maggioni, G. L. Davis, F. J. Warner, F. B. Geshwind, A. C. Coppi, R. A. DeVerse,and R. R. Coifman. Hyperspectral microscopic analysis of normal, benign and carcinoma microarray tissue sections. Manuscript, 2006.
35
 
36
B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Application of dimensionality reduction in recommender system - a case study. In Proceedings of the WebKDD 2000 Workshop, pages 000--000, 2000.
 
37
G. W. Stewart. Four algorithms for the efficient computation of truncated QR approximations to a sparse matrix. Numerische Mathematik, 83:313--323, 1999.
 
38
G. W. Stewart. Error analysis of the quasi-Gram-Schmidt algorithm. Technical Report UMIACS TR-2004-17 CMSC TR-4572, University of Maryland, College Park, MD, 2004.
 
39
L. R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, 31(3):279--311, 1966.
 
40
C. K. I. Williams and M. Seeger. Using the Nyström method to speed up kernel machines. In Annual Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference, pages 682--688, 2001.


Collaborative Colleagues:
Michael W. Mahoney: colleagues
Mauro Maggioni: colleagues
Petros Drineas: colleagues