ABSTRACT
The value of knowledge obtainable by analysing large quantities of data is widely acknowledged. However, so-called primary or raw data may not always be available for knowledge discovery, for several reasons. First, cooperating institutions that are interested in sharing knowledge may not be willing (or allowed) to disclose their primary data. Second, data in the form of streams are only temporarily available for processing. If stored at all, stream data are maintained in the form of synopses or derived, abstract representations of the original data. Finally, even for non-stream data, there are limits on the achievable computation speed, set by hardware and firmware technologies. This problem can only be partially solved through parallelization and increased processing power. Ultimately, in many cases data must be summarized to be processed efficiently. In the light of these observations, we anticipate the need for defining and practising data mining without the luxury of primary data. To that end, we formally introduce the paradigm of Higher Order Mining (HOM) as a form of data mining that is applied over non-primary, derived data or patterns. Although HOM is a new paradigm, there are already research advances on methods that discover knowledge from patterns rather than from data. We discuss these advances and organize them in the light of the new paradigm. We show that the HOM paradigm reveals further potential for knowledge discovery, including the delivery of rules and patterns with semantics that are closer to human intuition and are thus more appropriate for human inspection.