|
ABSTRACT
The problem of disk declustering is to distribute data among multiple disks to reduce query response times through parallel I/O. A strictly optimal declustering technique is one that achieves optimal parallel I/O for all possible queries. In this paper, we focus on techniques that are optimized for spatial range queries. Current declustering techniques, which have single copies of the data, have been shown to be suboptimal for range queries. The lower bound on extra disk accesses is proved to be Ω(log N) for N disks even in the restricted case of an N-by-N grid, and all current approaches have been trying to achieve this bound. Replication is a well-known and effective solution for several problems in databases, especially for availability and load balancing. In this paper, we explore the idea of replication in the context of declustering and propose a framework where strictly optimal parallel I/O is achievable using a small amount of replication. We provide some theoretical foundations for replicated declustering, e.g., a bound for number of copies for strict optimality on any number of disks, and propose a class of replicated declustering schemes, periodic allocations, which are shown to be strictly optimal. The results for optimal disk allocation are extended for larger number of disks by increasing replication. Our techniques and results are valid for any arbitrary a-by-b grids, and any declustering scheme can be further improved using our replication framework. Using the framework, we perform experiments to identify a strictly optimal disk access schedule for any given arbitrary range query. In addition to the theoretical bounds, we compare the proposed replication based scheme to other existing techniques by performing experiments on real datasets.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
 |
3
|
Norbert Beckmann , Hans-Peter Kriegel , Ralf Schneider , Bernhard Seeger, The R*-tree: an efficient and robust access method for points and rectangles, Proceedings of the 1990 ACM SIGMOD international conference on Management of data, p.322-331, May 23-26, 1990, Atlantic City, New Jersey, United States
|
 |
4
|
Stefan Berchtold , Christian Böhm , Bernhard Braunmüller , Daniel A. Keim , Hans-Peter Kriegel, Fast parallel similarity search in multimedia databases, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.1-12, May 11-15, 1997, Tucson, Arizona, United States
|
| |
5
|
|
| |
6
|
R. Bose and S. Shrikhande. On the construction of sets of mutually orthogonal latin squares and the falsity of a conjecture of euler. Euler. Trans. Am. Math. Sm., 95:191--209, 1960.
|
| |
7
|
|
 |
8
|
|
 |
9
|
|
 |
10
|
Ming-Syan Chen , Hui-I Hsiao , Chung-Sheng Li , Philip S. Yu, Using rotational mirrored declustering for replica placement in a disk-array-based video server, Proceedings of the third ACM international conference on Multimedia, p.121-130, November 05-09, 1995, San Francisco, California, United States
[doi> 10.1145/217279.215257]
|
| |
11
|
|
 |
12
|
|
| |
13
|
|
 |
14
|
|
| |
15
|
|
| |
16
|
H. Ferhatosmanoglu, A. S. Tosun, and A. Ramachandran. Replicated declustering of spatial data: A technical report. http://www.cse.ohiostate.edu/~hakan/repdec-report.pdf, 2003.
|
 |
17
|
|
| |
18
|
|
 |
19
|
Shahram Ghandeharizadeh , David J. DeWitt , Waheed Qureshi, A performance analysis of alternative multi-attribute declustering strategies, Proceedings of the 1992 ACM SIGMOD international conference on Management of data, p.29-38, June 02-05, 1992, San Diego, California, United States
|
| |
20
|
L. Golubchik , S. Khanna , S. Khuller , R. Thurimella , A. Zhu, Approximation algorithms for data placement on parallel disks, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.223-232, January 09-11, 2000, San Francisco, California, United States
|
| |
21
|
|
 |
22
|
|
| |
23
|
|
 |
24
|
|
| |
25
|
|
| |
26
|
|
| |
27
|
R. Muntz, J. Santos, and S. Berson. A parallel disk storage system for real-time multimedia applications. International Journal of Intelligent Systems, Special Issue on Multimedia Computing System, 13(12):1137--74, December 1998.
|
| |
28
|
|
 |
29
|
|
| |
30
|
|
| |
31
|
Peter Sanders , Sebastian Egner , Jan Korst, Fast concurrent access to parallel disks, Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, p.849-858, January 09-11, 2000, San Francisco, California, United States
|
| |
32
|
J. Santos and R. Muntz. Design of the RIO (randomized I/O) storage server. Technical Report TR970032, UCLA Computer Science Department, 1997. http://mml.cs.ucla.edu/publications/papers/cstech970032.ps.
|
| |
33
|
Seagate. Seagate specifications. http://www.seagate.com/pdf/datasheets/, December 2003.
|
| |
34
|
|
 |
35
|
|
| |
36
|
|
|