Heidi matrix: nearest neighbor driven high dimensional data visualization
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating Automated Analysis with Interactive Exploration
Paris, France
Pages 83-92  
Year of Publication: 2009
ISBN: 978-1-60558-670-0
Authors
Soujanya Vadapalli  Centre for Data Engineering, IIIT-Hyderabad, India
Kamalakar Karlapalem  Centre for Data Engineering, IIIT-Hyderabad, India
Sponsors
PASCAL2 - Pattern Analysis, Statistical Modelling and Computational Learning
Helsinki Institute for Information Technology HIIT
VisMaster, a European FP7 Coordination Action Project focused on Visual Analytics
Danube University Krems, Department of Information and Knowledge Engineering (DUK)
National Visualization and Analytics Center (NVAC)
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1562849.1562859

ABSTRACT

Identifying patterns in large, high dimensional data sets is a challenge. As the number of dimensions increases, the patterns in a data set tend to be more prominent in subspaces than in the original full-dimensional space. Understanding such data therefore requires a system that can present these subspace-oriented patterns.

Heidi is a high dimensional data visualization system that captures and visualizes the closeness of points across various subspaces of the dimensions, thereby helping users understand the data. The core concept behind Heidi is the prominence of patterns in the nearest neighbor relations between pairs of points across the subspaces.

Given a d-dimensional data set as input, the Heidi system generates a 2-D matrix represented as a color image. This representation gives insight into (i) how the clusters are placed with respect to each other, (ii) the characteristics of the placement of points within a cluster in all the subspaces, and (iii) the characteristics of overlapping clusters in various subspaces.
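The abstract does not spell out how the matrix is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes exhaustive enumeration of all non-empty subspaces, Euclidean distance, a fixed neighbor count k, and a bitmask encoding in which bit b of entry (i, j) is set when point j is among the k nearest neighbors of point i in subspace b. The function names `knn_sets` and `subspace_knn_matrix` are hypothetical.

```python
from itertools import combinations
import math


def knn_sets(points, dims, k):
    """For each point, return the index set of its k nearest neighbors,
    with distances measured only over the dimensions listed in `dims`."""
    n = len(points)
    result = []
    for i in range(n):
        dists = []
        for j in range(n):
            if j == i:
                continue
            dist = math.sqrt(sum((points[i][a] - points[j][a]) ** 2 for a in dims))
            dists.append((dist, j))
        dists.sort()  # ties are broken by point index
        result.append({j for _, j in dists[:k]})
    return result


def subspace_knn_matrix(points, k):
    """Build an n x n matrix of bitmasks (assumed encoding, see lead-in).

    Bit b of matrix[i][j] is 1 when j is a k-nearest neighbor of i
    in the b-th subspace of the returned `subspaces` list.
    """
    d = len(points[0])
    # Assumption: every non-empty subset of dimensions is a subspace.
    subspaces = [dims for r in range(1, d + 1)
                 for dims in combinations(range(d), r)]
    n = len(points)
    matrix = [[0] * n for _ in range(n)]
    for bit, dims in enumerate(subspaces):
        neighbors = knn_sets(points, dims, k)
        for i in range(n):
            for j in neighbors[i]:
                matrix[i][j] |= 1 << bit
    return matrix, subspaces


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 5.0), (5.0, 0.2), (5.1, 5.3)]
    matrix, subspaces = subspace_knn_matrix(pts, k=1)
    print(subspaces)
    for row in matrix:
        print(row)
```

Each distinct bitmask could then be assigned a distinct color to render the matrix as an image. Enumerating all 2^d - 1 subspaces is only practical for small d; a real system would need to restrict or sample the subspaces considered.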

A sample of results displayed and discussed in this paper illustrates how the Heidi visualization can be interpreted.


