ABSTRACT
This paper proposes a scheme for scheduling disk requests that takes advantage of the ability of high-level functions to operate directly at individual disk drives. We show that such a scheme makes it possible to support a Data Mining workload on an OLTP system almost for free: there is only a small impact on the throughput and response time of the existing workload. Specifically, we show that an OLTP system has the disk resources to consistently provide one third of its sequential bandwidth to a background Data Mining task with close to zero impact on OLTP throughput and response time at high transaction loads. At low transaction loads, we show much lower impact than observed in previous work. This means that a production OLTP system can be used for Data Mining tasks without the expense of a second dedicated system. Our scheme takes advantage of close interaction with the on-disk scheduler by reading blocks for the Data Mining workload as the disk head “passes over” them while satisfying demand blocks from the OLTP request stream. We show that this scheme provides a consistent level of throughput for the background workload even at very high foreground loads. Such a scheme is of most benefit in combination with an Active Disk environment that allows the background Data Mining application to also take advantage of the processing power and memory available directly on the disk drives.
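The "passes over" idea can be illustrated with a minimal sketch. This is not the paper's scheduler: it assumes a single track with uniformly sized sectors, ignores seek time, head switches and zoning, and every name in it (`free_sectors`, `arrival_sector`, `target_sector`, `wanted`) is hypothetical. It only shows which background blocks rotate under the head while the disk waits for the foreground (OLTP) sector to arrive.

```python
def free_sectors(arrival_sector, target_sector, wanted, sectors_per_track):
    """Return the background sectors read "for free" during rotational latency.

    arrival_sector    -- sector under the head when it settles on the track
    target_sector     -- sector requested by the foreground (OLTP) workload
    wanted            -- set of sectors the background (mining) scan still needs
    sectors_per_track -- number of sectors on this (idealized) track
    """
    # Sectors that rotate under the head before the target sector arrives.
    wait = (target_sector - arrival_sector) % sectors_per_track
    passed = [(arrival_sector + i) % sectors_per_track for i in range(wait)]
    # Only sectors the background task still needs are worth reading.
    return [s for s in passed if s in wanted]


if __name__ == "__main__":
    # Head settles at sector 5 of an 8-sector track; the OLTP request is for sector 2.
    picked = free_sectors(5, 2, {0, 3, 6, 7}, 8)
    print(picked)  # [6, 7, 0]
```

In this toy model the foreground request is never delayed, because the background reads use only time the disk would otherwise spend waiting for the target sector. Sector 3 is skipped since it lies past the target.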