Integrating association rule mining with relational database systems: alternatives and implications
Source: International Conference on Management of Data
Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data
Seattle, Washington, United States
Pages: 343 - 354  
Year of Publication: 1998
ISBN: 0-89791-995-5
Authors
Sunita Sarawagi  IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Shiby Thomas  Dept. of Computer & Information Science & Engineering, University of Florida, Gainesville and IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Rakesh Agrawal  IBM Almaden Research Center, 650 Harry Road, San Jose, CA
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/276304.276335

ABSTRACT

Data mining on large data warehouses is becoming increasingly important. In support of this trend, we consider a spectrum of architectural alternatives for coupling mining with database systems. These alternatives include: loose-coupling through a SQL cursor interface; encapsulation of a mining algorithm in a stored procedure; caching the data to a file system on-the-fly and mining; tight-coupling using primarily user-defined functions; and SQL implementations for processing in the DBMS. We comprehensively study the option of expressing the mining algorithm in the form of SQL queries, using association rule mining as a case in point. We consider four options in SQL-92 and six options in SQL enhanced with object-relational extensions (SQL-OR). Our evaluation of the different architectural alternatives shows that, from a performance perspective, the Cache-Mine option is superior, although the performance of the SQL-OR option is within a factor of two. Both the Cache-Mine and the SQL-OR approaches incur a higher storage penalty than the loose-coupling approach, which performance-wise is a factor of 3 to 4 worse than Cache-Mine. The SQL-92 implementations were too slow to qualify as a competitive option. We also compare these alternatives on the basis of qualitative factors like automatic parallelization, development ease, portability and inter-operability.
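
To make two of the alternatives named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It counts frequent item pairs in two ways: by pulling rows through a cursor and counting in the application (loose coupling), and by pushing the counting into the database as a plain SQL self-join. The table name `sales`, its columns `tid` and `item`, the sample rows, and both function names are assumptions made for illustration. SQLite stands in for the DBMS, and each (tid, item) row is assumed to appear at most once.

```python
import sqlite3
from collections import Counter
from itertools import combinations


def make_db():
    """Build a tiny in-memory transaction table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    rows = [
        (1, "bread"), (1, "milk"),
        (2, "bread"), (2, "milk"), (2, "eggs"),
        (3, "milk"), (3, "eggs"),
        (4, "bread"), (4, "milk"),
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def frequent_pairs_cursor(conn, minsup):
    """Loose coupling: fetch rows via a cursor, count pairs in the application."""
    baskets = {}
    for tid, item in conn.execute("SELECT tid, item FROM sales"):
        baskets.setdefault(tid, set()).add(item)
    counts = Counter()
    for items in baskets.values():
        # Sort so each pair has one canonical ordering.
        counts.update(combinations(sorted(items), 2))
    return {pair: n for pair, n in counts.items() if n >= minsup}


def frequent_pairs_sql(conn, minsup):
    """SQL implementation: the DBMS does the pair counting with a self-join."""
    query = """
        SELECT a.item, b.item, COUNT(*)
        FROM sales a JOIN sales b
          ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
    """
    return {(x, y): n for x, y, n in conn.execute(query, (minsup,))}


if __name__ == "__main__":
    conn = make_db()
    print(frequent_pairs_cursor(conn, 2))
    print(frequent_pairs_sql(conn, 2))
```

Both functions should return the same pairs for the same minimum support. The difference is where the work happens: the cursor version moves every row out of the database, while the SQL version returns only the qualifying pairs.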


