|
ABSTRACT
Declustering schemes allocate data blocks among multiple disks to enable parallel retrieval. Given a declustering scheme D, its response time with respect to a query Q, rt(Q), is defined to be the maximum number of disk blocks of the query stored by the scheme in any one of the disks. If |Q| is the number of data blocks in Q and M is the number of disks then rt(Q) is at least |Q|/M. One way to evaluate the performance of D with respect to a set of queries 𝑄 is to measure its additive error - the maximum difference between rt(Q) from |Q|/M over all range queries Q ε 𝑄.In this paper, we consider the problem of designing declustering schemes for uniform multidimensional data arranged in a d-dimensional grid so that their additive errors with respect to range queries are as small as possible. It has been shown that such declustering schemes will have an additive error of Ω(log M) when d = 2 and Ω(log d-1/2 M) when d > 2 with respect to range queries.Asymptotically optimal declustering schemes exist for 2-dimensional data. For data in larger dimensions, however, the best bound for additive errors is O(Md-1), which is extremely large. In this paper, we propose the two declustering schemes based on low discrepancy points in d-dimensions. When d is fixed, both schemes have an additive error of O(logd-1 M) with respect to range queries provided certain conditions are satisfied: the first scheme requires d ≥ 3 and M to be a power of a prime where the prime is at least d while the second scheme requires the size of the data to grow within some polynomial of M, with no restriction on M. These are the first known multidimensional declustering schemes with additive errors near optimal.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
NCR WorldMark/Teradata 1 TB TPC-D Executive Summary. www.tpc.org.
|
| |
2
|
Sun Fire 6800 midframe server with Sun Storedge arrays. www.sun.com/smi/Press/sunflash/2002-05/sunflash.20020514.2.html.
|
| |
3
|
|
 |
4
|
|
| |
5
|
|
| |
6
|
|
| |
7
|
Chialin Chang , Bongki Moon , Anurag Acharya , Carter Shock , Alan Sussman , Joel H. Saltz, Titan: A High-Performance Remote Sensing Database, Proceedings of the Thirteenth International Conference on Data Engineering, p.375-384, April 07-11, 1997
|
| |
8
|
|
| |
9
|
Chung-Min Chen , Rakesh K. Sinha , Randeep Bhatia, Efficient Disk Allocation Schemes for Parallel Retrieval of Multidimensional Grid Data, Proceedings of the 13th International Conference on Scientific and Statistical Database Management, p.213-222, July 18-20, 2001
|
| |
10
|
|
 |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
| |
15
|
H. Faure. Discrepancy of sequences associated with a number system (in dimension s) (in french). Acta Arithmetica, 41(4):337-351, 1982.
|
| |
16
|
|
| |
17
|
J. H. Halton. On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numerische Mathematik, 2:84-90, 1960.
|
| |
18
|
B. Himatsingka, J. Srivastava, J.-Z. Li, and D. Rotem. Latin hypercubes: A class of multidimensional declustering techniques, 1994. Technical Report TR94-05, Department of Computer Science, University of Minnesota.
|
| |
19
|
|
 |
20
|
|
| |
21
|
|
| |
22
|
|
| |
23
|
|
| |
24
|
J. Matousek. Geometric discrepancy, an illustrated guide. Springer-Verlag, 1999.
|
| |
25
|
|
| |
26
|
|
| |
27
|
H. Niederreiter. Nets, (t, s) sequences, and algebraic curves over finite fields with many rational points. Documenta Math. J. DMV, 3:377-386, 1998.
|
| |
28
|
|
| |
29
|
|
| |
30
|
H. E. Rose. A course in number theory. Oxford University Press, Walton Street, Oxford, 1988.
|
| |
31
|
K. F. Roth. On irregularities of distribution. Mathematika, 1:73-79, 1954.
|
| |
32
|
W. Schmidt. On irregularities of distribution vii. Acta Arithmetica, 21:45-50, 1972.
|
| |
33
|
|
| |
34
|
J. Srivastava, T. Niccum, and B. Himatsingka. Data declustering in PADMA: A parallel database manager. Bulletin of Technical Communication on Data Engineering, 17(3):3-13, 1994.
|
| |
35
|
|
|