Workload-aware data partitioning in community-driven data grids
Source: Extending Database Technology; Vol. 360
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Saint Petersburg, Russia
SESSION: Research sessions: System architectures
Pages 36-47  
Year of Publication: 2009
ISBN:978-1-60558-422-5
Authors
Tobias Scholl  Technische Universität München, Munich, Germany
Bernhard Bauer  Technische Universität München, Munich, Germany
Jessica Müller  Technische Universität München, Munich, Germany
Benjamin Gufler  Technische Universität München, Munich, Germany
Angelika Reiser  Technische Universität München, Munich, Germany
Alfons Kemper  Technische Universität München, Munich, Germany
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1516360.1516366

ABSTRACT

Collaborative research in various scientific disciplines requires support for scalable data management enabling the efficient correlation of globally distributed data sources. Motivated by the expected data rates of upcoming projects and a growing number of users, communities explore new data management techniques for achieving high throughput. Community-driven data grids deliver such high-throughput data distribution for scientific federations by partitioning data according to application-specific data and query characteristics. Query hot spots are an important and challenging problem in this environment. Existing approaches to load-balancing from Peer-to-Peer (P2P) data management and sensor networks do not directly meet the requirements of a data-intensive e-science environment. In this paper, our contributions are partitioning schemes based on multi-dimensional index structures enabling communities to trade off data load balancing and handling query hot spots via splitting and replication. We evaluate the partitioning schemes with two typical kinds of data sets from the astrophysics domain and workloads extracted from Sloan Digital Sky Survey (SDSS) query traces and perform throughput measurements in real and simulated networks. The experiments demonstrate the improved workload distribution capabilities and give promising directions for the development of future community grids.
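The trade-off between data load balancing and query hot spots mentioned in the abstract can be illustrated with a small sketch. This is not the partitioning scheme from the paper: it is a minimal quadtree-style splitter under assumed simplifications. The function name `partition`, the weighting parameter `alpha`, the threshold `max_load`, and the modelling of queries as single points are all hypothetical. Only splitting is shown; replication is left out.

```python
def partition(points, queries, bounds, alpha, max_load, depth=0, max_depth=8):
    """Recursively split a 2-D region into quadrants until each leaf is light enough.

    Hypothetical load model: alpha weights the number of stored data points,
    (1 - alpha) weights the number of queries that fall into the region.
    Regions are half-open: [x0, x1) x [y0, y1).
    Returns a list of leaves as (bounds, n_points, n_queries).
    """
    x0, y0, x1, y1 = bounds
    load = alpha * len(points) + (1 - alpha) * len(queries)
    if load <= max_load or depth >= max_depth:
        return [(bounds, len(points), len(queries))]

    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]

    def inside(p, b):
        # Half-open containment test, so each item lands in exactly one quadrant.
        return b[0] <= p[0] < b[2] and b[1] <= p[1] < b[3]

    leaves = []
    for b in quadrants:
        leaves += partition([p for p in points if inside(p, b)],
                            [q for q in queries if inside(q, b)],
                            b, alpha, max_load, depth + 1, max_depth)
    return leaves


# Evenly spread data, but every query hits the same corner (a hot spot).
points = [(i / 4 + 0.125, j / 4 + 0.125) for i in range(4) for j in range(4)]
queries = [(0.1, 0.1)] * 8

by_data = partition(points, queries, (0, 0, 1, 1), alpha=1.0, max_load=4, max_depth=3)
by_query = partition(points, queries, (0, 0, 1, 1), alpha=0.0, max_load=4, max_depth=3)
```

With `alpha=1.0` the splitter balances only data volume and stops after one uniform split. With `alpha=0.0` it keeps subdividing the hot corner and leaves the rest of the space coarse. Intermediate values of `alpha` mix the two criteria.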


