Making every bit count: fast nonlinear axis scaling
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
Poster session: Poster papers
Pages: 664-669
Year of Publication: 2002
ISBN: 1-58113-567-X
Authors
Leejay Wu, Carnegie Mellon University
Christos Faloutsos, Carnegie Mellon University
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
AAAI
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/775047.775146

ABSTRACT

Existing axis scaling and dimensionality reduction methods focus on preserving structure, usually as measured by the Euclidean distance. In other words, they inherently assume that the Euclidean distance is already correct. We instead propose a novel nonlinear approach driven by an information-theoretic viewpoint, which we show is also strongly linked to intrinsic dimensionality (degrees of freedom) and to uniformity. Nonlinear transformations based on common probability distributions, combined with information-driven selection, simultaneously reduce the number of dimensions required and increase the value of those we retain. Experiments on real data confirm that this approach reveals correlations, finds novel attributes, and scales well.
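
The abstract does not spell out the procedure, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes a single positive, skewed attribute, uses the CDF of an exponential distribution (fitted by the sample mean) as the one candidate nonlinear scaling, and scores each candidate by the Shannon entropy of an equal-width histogram. The function names, the bin count, and the choice of entropy as the selection score are all assumptions made for illustration.

```python
import math
import random


def histogram_entropy(values, bins=16):
    """Shannon entropy (in bits) of an equal-width histogram of `values`."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def exponential_cdf_scale(values):
    """Map positive values through the CDF of an exponential fitted by its mean.

    Hypothetical helper: if the data really are exponential, the result
    is approximately uniform on [0, 1).
    """
    mean = sum(values) / len(values)
    return [1.0 - math.exp(-v / mean) for v in values]


def pick_scaling(values, bins=16):
    """Score each candidate scaling by histogram entropy and return the best."""
    candidates = {
        "identity": list(values),
        "exponential_cdf": exponential_cdf_scale(values),
    }
    scores = {name: histogram_entropy(vals, bins) for name, vals in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic skewed attribute (assumption: stands in for a real data column).
rng = random.Random(0)
skewed = [rng.expovariate(1.0) for _ in range(5000)]
best, scores = pick_scaling(skewed)
print(best, scores)
```

On skewed input like this, the CDF-scaled attribute spreads its values far more evenly across the bins than the raw attribute does, so its histogram entropy is closer to the maximum of log2(bins) bits. That is the sense in which a distribution-based rescaling can make more of the available bits carry information.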


