ABSTRACT
Besides improving data availability during disk failures, data replication is also used to speed up the I/O performance of read-intensive applications. Two issues need to be addressed: (a) data placement (which disks should store the copies of each data block?) and (b) scheduling (given a query Q and a placement scheme P of the data, from which disk should each block in Q be retrieved so that retrieval time is minimized?). In this paper, we consider range queries and assume that the dataset is a multidimensional grid and that r copies of each unit block of the grid must be stored among M disks. To measure the performance of a scheduling algorithm accurately, we consider a metric that accounts for the scheduling overhead as well as the time it takes to retrieve the data blocks from the disks. We describe several combinations of data placement schemes and scheduling algorithms and analyze their performance for range queries with respect to this metric. We then present simulation results for the most interesting case, r=2, showing that these strategies outperform the previously known method, especially for large queries.
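To make the scheduling question concrete, the following is a minimal sketch, not the algorithm analyzed in the paper. It assumes the placement is given as a hypothetical map `replicas` from each block of a query to the disks holding its copies, and it picks one disk per block so that the largest number of blocks read from any single disk is as small as possible. It uses a standard augmenting-path assignment with per-disk capacities and ignores the scheduling overhead that the paper's metric also counts. The names `schedule`, `_try_assign`, and `example_replicas` are illustrative assumptions.

```python
from typing import Dict, Hashable, List, Set

Block = Hashable


def _try_assign(block: Block,
                replicas: Dict[Block, List[int]],
                on_disk: Dict[int, List[Block]],
                limit: int,
                visited: Set[int]) -> bool:
    """Place `block` on one of its replica disks, moving other blocks if needed."""
    for disk in replicas[block]:
        if disk in visited:
            continue
        visited.add(disk)
        # The disk still has spare capacity: take it directly.
        if len(on_disk[disk]) < limit:
            on_disk[disk].append(block)
            return True
        # Otherwise try to move one block already on this disk elsewhere.
        for other in list(on_disk[disk]):
            if _try_assign(other, replicas, on_disk, limit, visited):
                on_disk[disk].remove(other)
                on_disk[disk].append(block)
                return True
    return False


def schedule(replicas: Dict[Block, List[int]], num_disks: int) -> Dict[Block, int]:
    """Return a block -> disk map minimizing the maximum blocks read per disk."""
    n = len(replicas)
    if n == 0:
        return {}
    # No schedule can beat ceil(n / num_disks); try increasing limits from there.
    for limit in range(-(-n // num_disks), n + 1):
        on_disk: Dict[int, List[Block]] = {d: [] for d in range(num_disks)}
        if all(_try_assign(b, replicas, on_disk, limit, set()) for b in replicas):
            return {b: d for d, blocks in on_disk.items() for b in blocks}
    raise ValueError("some block has no replica on a valid disk")


# Hypothetical query of four blocks with r=2 copies each on M=4 disks.
example_replicas = {"a": [0, 1], "b": [0, 1], "c": [0, 2], "d": [0, 3]}
example_schedule = schedule(example_replicas, num_disks=4)
```

In this example every block has a copy on disk 0, so always reading the first copy would serialize all four reads on one disk, whereas the sketch spreads them across the four disks. Trying limits one by one keeps the code short; a binary search over the limit or a single max-flow computation would be the usual choice for large queries.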