ACM Home Page
Please provide us with feedback. Feedback
From discrepancy to declustering: near-optimal multidimensional declustering strategies for range queries
Full text PdfPdf (215 KB)
Source Symposium on Principles of Database Systems archive
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Madison, Wisconsin
SESSION: Research session 1: award winning papers table of contents
Pages: 29 - 38  
Year of Publication: 2002
ISBN:1-58113-507-6
Authors
Chung-Min Chen  Telcordia Technologies, Morristown, NJ
Christine T. Cheng  Institute for Mathematics & its Applications, Minneapolis, MN
Sponsors
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGMOD: ACM Special Interest Group on Management of Data
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 22,   Citation Count: 11
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/543613.543618
What is a DOI?

ABSTRACT

Declustering schemes allocate data blocks among multiple disks to enable parallel retrieval. Given a declustering scheme D, its response time with respect to a query Q, rt(Q), is defined to be the maximum number of disk blocks of the query stored by the scheme in any one of the disks. If |Q| is the number of data blocks in Q and M is the number of disks then rt(Q) is at least |Q|/M. One way to evaluate the performance of D with respect to a set of queries 𝑄 is to measure its additive error - the maximum difference between rt(Q) from |Q|/M over all range queries Q ε 𝑄.In this paper, we consider the problem of designing declustering schemes for uniform multidimensional data arranged in a d-dimensional grid so that their additive errors with respect to range queries are as small as possible. It has been shown that such declustering schemes will have an additive error of Ω(log M) when d = 2 and Ω(log d-1/2 M) when d > 2 with respect to range queries.Asymptotically optimal declustering schemes exist for 2-dimensional data. For data in larger dimensions, however, the best bound for additive errors is O(Md-1), which is extremely large. In this paper, we propose the two declustering schemes based on low discrepancy points in d-dimensions. When d is fixed, both schemes have an additive error of O(logd-1 M) with respect to range queries provided certain conditions are satisfied: the first scheme requires d ≥ 3 and M to be a power of a prime where the prime is at least d while the second scheme requires the size of the data to grow within some polynomial of M, with no restriction on M. These are the first known multidimensional declustering schemes with additive errors near optimal.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
NCR WorldMark/Teradata 1 TB TPC-D Executive Summary. www.tpc.org.
 
2
Sun Fire 6800 midframe server with Sun Storedge arrays. www.sun.com/smi/Press/sunflash/2002-05/sunflash.20020514.2.html.
 
3
4
 
5
 
6
 
7
 
8
 
9
Chung-Min Chen , Rakesh K. Sinha , Randeep Bhatia, Efficient Disk Allocation Schemes for Parallel Retrieval of Multidimensional Grid Data, Proceedings of the 13th International Conference on Scientific and Statistical Database Management, p.213-222, July 18-20, 2001
 
10
11
 
12
 
13
 
14
 
15
H. Faure. Discrepancy of sequences associated with a number system (in dimension s) (in french). Acta Arithmetica, 41(4):337-351, 1982.
 
16
 
17
J. H. Halton. On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numerische Mathematik, 2:84-90, 1960.
 
18
B. Himatsingka, J. Srivastava, J.-Z. Li, and D. Rotem. Latin hypercubes: A class of multidimensional declustering techniques, 1994. Technical Report TR94-05, Department of Computer Science, University of Minnesota.
 
19
20
 
21
 
22
 
23
 
24
J. Matousek. Geometric discrepancy, an illustrated guide. Springer-Verlag, 1999.
 
25
 
26
 
27
H. Niederreiter. Nets, (t, s) sequences, and algebraic curves over finite fields with many rational points. Documenta Math. J. DMV, 3:377-386, 1998.
 
28
 
29
 
30
H. E. Rose. A course in number theory. Oxford University Press, Walton Street, Oxford, 1988.
 
31
K. F. Roth. On irregularities of distribution. Mathematika, 1:73-79, 1954.
 
32
W. Schmidt. On irregularities of distribution vii. Acta Arithmetica, 21:45-50, 1972.
 
33
 
34
J. Srivastava, T. Niccum, and B. Himatsingka. Data declustering in PADMA: A parallel database manager. Bulletin of Technical Communication on Data Engineering, 17(3):3-13, 1994.
 
35

CITED BY  11

Collaborative Colleagues:
Chung-Min Chen: colleagues
Christine T. Cheng: colleagues