| Discovery in multi-attribute data with user-defined constraints |
| Full text |
Pdf
(510 KB)
|
| Source
|
ACM SIGKDD Explorations Newsletter
archive
Volume 4 , Issue 1 (June 2002)
table of contents
COLUMN: Constraints in data mining
table of contents
Pages: 56 - 64
Year of Publication: 2002
ISSN:1931-0145
|
|
Authors
|
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 2, Downloads (12 Months): 14, Citation Count: 4
|
|
|
ABSTRACT
There has been a growing interest in mining frequent itemsets in relational data with multiple attributes. A key step in this approach is to select a set of attributes that group data into transactions and a separate set of attributes that labels data into items. Unsupervised and unrestricted mining, however, is stymied by the combinatorial complexity and the quantity of patterns as the number of attributes grows. In this paper, we focus on leveraging the semantics of the underlying data for mining frequent itemsets. For instance, there are usually taxonomies in the data schema and functional dependencies among the attributes. Domain knowledge and user preferences often have the potential to significantly reduce the exponentially growing mining space. These observations motivate the design of a user-directed data mining framework that allows such domain knowledge to guide the mining process and control the mining strategy. We show examples of tremendous reduction in computation by using domain knowledge in mining relational data with multiple attributes.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Ramesh C. Agarwal , Charu C. Aggarwal , V. V. V. Prasad, Depth first generation of long patterns, Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, p.108-118, August 20-23, 2000, Boston, Massachusetts, United States
[doi> 10.1145/347090.347114]
|
 |
2
|
Rakesh Agrawal , Tomasz Imieliński , Arun Swami, Mining association rules between sets of items in large databases, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.207-216, May 25-28, 1993, Washington, D.C., United States
|
| |
3
|
|
 |
4
|
|
| |
5
|
J. Deogun, V. Raghavan, A. Sarkar, and H. Sever. Data mining: Research trends, challenges, and applications, 1997.
|
| |
6
|
H. B. Enderton. A Mathematical Introduction to Logic. Academic Press, 2nd edition, December 2000.
|
| |
7
|
|
| |
8
|
|
 |
9
|
Jiawei Han , Jian Pei , Yiwen Yin, Mining frequent patterns without candidate generation, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.1-12, May 15-18, 2000, Dallas, Texas, United States
|
| |
10
|
|
 |
11
|
Raymond T. Ng , Laks V. S. Lakshmanan , Jiawei Han , Alex Pang, Exploratory mining and pruning optimizations of constrained associations rules, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.13-24, June 01-04, 1998, Seattle, Washington, United States
|
| |
12
|
B. Padmanabhan and A. Tuzhilin. Unexpectedness as a measure of interestingness in knowledge discovery, 1999.
|
| |
13
|
|
 |
14
|
|
| |
15
|
|
| |
16
|
|
| |
17
|
R. Srikant, Q. Vu, and R. Agrawal. Mining association rules with item constraints. In Int'l Conf. on Knowledge Discovery and Data Mining (SIGKDD), pages 67-93, 1997.
|
| |
18
|
A. Tarski. A lattice-theoretical fixpoint theorem and its applications. Pacific J. Math., pages 253-309, 1955.
|
 |
19
|
|
CITED BY 4
|
Daniel Kifer , Johannes Gehrke , Cristian Bucila , Walker White, How to quickly find a witness, Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.272-283, June 09-11, 2003, San Diego, California
|
|
|
|
|
|
|
|
|
Peer to Peer - Readers of this Article have also read:
-
Data structures for quadtree approximation and compression
Communications of the ACM
28, 9
Hanan Samet
-
A hierarchical single-key-lock access control using the Chinese remainder theorem
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing
Kim S. Lee
, Huizhu Lu
, D. D. Fisher
-
The GemStone object database management system
Communications of the ACM
34, 10
Paul Butterworth
, Allen Otis
, Jacob Stein
-
Putting innovation to work: adoption strategies for multimedia communication systems
Communications of the ACM
34, 12
Ellen Francik
, Susan Ehrlich Rudman
, Donna Cooper
, Stephen Levine
-
An intelligent component database for behavioral synthesis
Proceedings of the 27th ACM/IEEE Design Automation Conference on
Gwo-Dong Chen
, Daniel D. Gajski
|