Tri-plots: scalable tools for multidimensional data mining
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
San Francisco, California
Pages: 184-193
Year of Publication: 2001
ISBN: 1-58113-391-X
Authors
Agma Traina, University of São Paulo at São Carlos, Brazil
Caetano Traina, University of São Paulo at São Carlos, Brazil
Spiros Papadimitriou, Carnegie Mellon University
Christos Faloutsos, Carnegie Mellon University
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
AAAI: American Association for Artificial Intelligence
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/502512.502538

ABSTRACT

We focus on the problem of finding patterns across two large, multidimensional datasets. For example, given feature vectors of healthy and of non-healthy patients, we want to answer the following questions: Are the two clouds of points separable? What is the smallest/largest pair-wise distance across the two datasets? Which of the two clouds does a new point (feature vector) come from? We propose a new tool, the tri-plot, and its generalization, the pq-plot, which help us answer the above questions. We provide a set of rules on how to interpret a tri-plot, and we apply these rules to synthetic and real datasets. We also show how to use our tool for classification, when traditional methods (nearest neighbor, classification trees) may fail.
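
To make the second question concrete, the following is a minimal brute-force sketch, not the scalable method of the paper. It computes the smallest and largest pair-wise distance across two small point sets, and counts how many cross-dataset pairs fall within a given radius. The function names (`cross_distances`, `cross_pair_counts`), the Euclidean metric, and the toy data are assumptions made for illustration; the abstract does not specify how the tri-plot itself is constructed.

```python
import bisect
import math
from itertools import product


def cross_distances(a, b):
    """Euclidean distance for every pair (p, q) with p in a and q in b.

    Brute force, O(len(a) * len(b)); illustrative only.
    """
    return [math.dist(p, q) for p, q in product(a, b)]


def cross_pair_counts(a, b, radii):
    """For each radius r, count the cross-dataset pairs at distance <= r."""
    dists = sorted(cross_distances(a, b))
    return [bisect.bisect_right(dists, r) for r in radii]


# Hypothetical toy feature vectors for the two clouds of points.
healthy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
non_healthy = [(3.0, 4.0), (4.0, 4.0)]

dists = cross_distances(healthy, non_healthy)
smallest, largest = min(dists), max(dists)
counts = cross_pair_counts(healthy, non_healthy, [1.0, 4.5, 6.0])

print(f"smallest cross distance: {smallest:.3f}")
print(f"largest cross distance:  {largest:.3f}")
print(f"pairs within each radius: {counts}")
```

In this toy example no cross pair lies within radius 1.0, which is consistent with two clouds that do not overlap. On large datasets the quadratic number of pairs makes this approach impractical, which is the scalability concern the title points to.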



