An effective algorithm for mining 3-clusters in vertically partitioned data
Source: Conference on Information and Knowledge Management
Proceedings of the 17th ACM Conference on Information and Knowledge Management
Napa Valley, California, USA
Session: KM: clustering
Pages 1103-1112
Year of Publication: 2008
ISBN: 978-1-59593-991-3
ABSTRACT
Conventional clustering algorithms group similar data points together along one dimension of a data table. Bi-clustering clusters both dimensions of a data table simultaneously. 3-clustering goes one step further: it concurrently clusters two data tables that share a common set of row labels but have distinct column labels. Such clusters reveal the underlying connections between the elements of all three sets. We present a novel algorithm that discovers 3-clusters across vertically partitioned data. Our approach contributes two new formulations: first, we introduce the notion of a 3-cluster in partitioned data; second, we present a mathematical formulation that measures the quality of such clusters. Our algorithm discovers high-quality, arbitrarily positioned, overlapping clusters and is time-efficient. We demonstrate these results in a comprehensive study on real datasets.
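To make the setting concrete, the following minimal sketch shows two small binary tables that share row labels but have distinct column labels, together with a candidate 3-cluster: a row set plus one column set from each table. The abstract does not state the paper's cluster definition or its quality formulation, so the density score used here (the fraction of cells equal to 1 in the two selected sub-tables) is only an illustrative assumption, and all names (`table_a`, `table_b`, `three_cluster_density`) are hypothetical.

```python
from typing import Dict, Set

# Each table maps a row label to the set of column labels whose cell is 1.
# The two tables share row labels (r1..r4) but have distinct column labels,
# which is the vertically partitioned setting described in the abstract.
Table = Dict[str, Set[str]]

table_a: Table = {
    "r1": {"a1", "a2"},
    "r2": {"a1", "a2"},
    "r3": {"a2", "a3"},
    "r4": {"a3"},
}
table_b: Table = {
    "r1": {"b1"},
    "r2": {"b1", "b2"},
    "r3": {"b2"},
    "r4": {"b2"},
}


def three_cluster_density(
    a: Table, b: Table, rows: Set[str], cols_a: Set[str], cols_b: Set[str]
) -> float:
    """Fraction of 1-cells in the sub-tables rows x cols_a and rows x cols_b.

    Assumed stand-in for a quality measure; NOT the paper's formulation.
    """
    cells = len(rows) * (len(cols_a) + len(cols_b))
    if cells == 0:
        return 0.0
    ones = sum(
        len(a.get(r, set()) & cols_a) + len(b.get(r, set()) & cols_b)
        for r in rows
    )
    return ones / cells


# A fully dense candidate: rows {r1, r2} with columns {a1, a2} and {b1}.
dense = three_cluster_density(table_a, table_b, {"r1", "r2"}, {"a1", "a2"}, {"b1"})

# Adding r3 dilutes the candidate, since r3 lacks a1 and b1.
diluted = three_cluster_density(
    table_a, table_b, {"r1", "r2", "r3"}, {"a1", "a2"}, {"b1"}
)

print(dense, diluted)
```

Under this assumed score, the first candidate is fully dense and the second is not, which illustrates how a quality measure can rank candidate 3-clusters that link rows to columns in both tables at once.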