ABSTRACT
E-science communities face huge data management challenges due to large existing data sets and the data rates expected from forthcoming projects. Community-driven data grids provide a scalable, high-throughput-oriented data management solution for scientific federations by employing domain-specific partitioning schemes and parallelism. In this paper, we present how community-driven data grids can adapt their query coordination strategies to different typical submission scenarios. We explore the impact of submitting queries uniformly across the grid versus concentrating them at submission hot spots. Through an extensive evaluation of five strategies on simulated and distributed setups, we show that some coordination strategies are preferable to others, regardless of submission skew. Based on our results, we can improve the usability and scalability of community-driven data grids for data-intensive applications.