Clustering transactions using large items
Source
Conference on Information and Knowledge Management
Proceedings of the eighth international conference on Information and knowledge management
Kansas City, Missouri, United States
Pages: 483-490
Year of Publication: 1999
ISBN: 1-58113-146-1
Authors
Ke Wang, School of Computing, National University of Singapore
Chu Xu, School of Computing, National University of Singapore
Bing Liu, School of Computing, National University of Singapore
ABSTRACT
In traditional data clustering, the similarity of a cluster of objects is measured by the pairwise similarity of the objects in that cluster. We argue that such measures are not appropriate for transactions that are sets of items. We propose the notion of large items, i.e., items contained in some minimum fraction of transactions in a cluster, to measure the similarity of a cluster of transactions. The intuition behind our clustering criterion is that there should be many large items within a cluster and little overlap of such items across clusters. We discuss the rationale behind our approach and its implications for providing a better solution to the clustering problem. We present a clustering algorithm based on the new clustering criterion and evaluate its effectiveness.
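The notion of large items can be illustrated with a small sketch. The Python below is an illustration only, not the authors' algorithm: the function names `large_items` and `overlap`, the toy transactions, and the 0.6 minimum fraction are all assumptions made for this example. It finds the items contained in at least a minimum fraction of the transactions of each cluster, then counts how many large-item occurrences are shared across clusters.

```python
from collections import Counter


def large_items(cluster, min_support):
    """Return the items contained in at least `min_support` (a fraction
    between 0 and 1) of the transactions in `cluster`.

    `cluster` is a list of transactions; each transaction is a set of items.
    """
    # Count, for every item, the number of transactions that contain it.
    counts = Counter(item for transaction in cluster for item in set(transaction))
    threshold = min_support * len(cluster)
    return {item for item, count in counts.items() if count >= threshold}


def overlap(large_sets):
    """Count large-item occurrences that are duplicated across clusters:
    the sum of the per-cluster set sizes minus the size of their union.
    This is an assumed, simple overlap measure used only for illustration.
    """
    union = set().union(*large_sets)
    return sum(len(s) for s in large_sets) - len(union)


# Hypothetical toy data: two clusters of transactions.
clusters = [
    [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}],
    [{"x", "y"}, {"x", "y", "a"}, {"x", "z"}],
]

large = [large_items(cluster, 0.6) for cluster in clusters]
shared = overlap(large)
```

In this toy grouping each cluster has its own large items and none of them is large in both clusters, which is the situation the criterion favors. Moving a transaction to a cluster where its items are rare would tend to shrink that cluster's set of large items.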
CITED BY 26
Bing Liu, Yiyuan Xia, Philip S. Yu, Clustering through decision tree construction, Proceedings of the ninth international conference on Information and knowledge management, p.20-29, November 06-11, 2000, McLean, Virginia, United States
Mohammed J. Zaki, Markus Peters, Ira Assent, Thomas Seidl, CLICKS: an effective algorithm for mining subspace clusters in categorical datasets, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
Xifeng Yan, Hong Cheng, Jiawei Han, Dong Xin, Summarizing itemset patterns: a profile-based approach, Proceeding of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
Qiaozhu Mei, Dong Xin, Hong Cheng, Jiawei Han, ChengXiang Zhai, Generating semantic annotations for frequent patterns with context analysis, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2006, Philadelphia, PA, USA
Yabo Xu, Ke Wang, Benyu Zhang, Zheng Chen, Privacy-enhancing personalized web search, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada