ABSTRACT
The Grid environment facilitates collaborative work and allows many users to query and process data over geographically dispersed data repositories. Over the past several years, there has been a growing interest in developing applications that interactively analyze datasets, potentially in a collaborative setting. We describe the Active Proxy-G service that is able to cache query results, use those results for answering new incoming queries, generate subqueries for the parts of a query that cannot be produced from the cache, and submit the subqueries for final processing at application servers that store the raw datasets. We present an experimental evaluation to illustrate the effects of various design tradeoffs. We also show the benefits that two real applications gain from using the middleware.
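The caching flow described above (answer what the cache can, generate subqueries for the remainder, and forward those to the application server) can be illustrated with a minimal sketch. It is not the Active Proxy-G implementation: it assumes queries are one-dimensional half-open integer ranges, and `uncovered`, `ProxyCache`, and `toy_server` are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, List, Tuple

Range = Tuple[int, int]  # half-open interval [start, end); an assumption of this sketch


def uncovered(query: Range, cached: List[Range]) -> List[Range]:
    """Return the sub-ranges of `query` that no cached range covers."""
    start, end = query
    gaps: List[Range] = []
    cursor = start
    for c_start, c_end in sorted(cached):
        if c_end <= cursor:
            continue  # cached range lies entirely before the cursor
        if c_start >= end:
            break  # cached range lies entirely after the query
        if c_start > cursor:
            gaps.append((cursor, c_start))  # hole before this cached range
        cursor = max(cursor, c_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


class ProxyCache:
    """Hypothetical proxy: reuses cached results and fetches only the gaps."""

    def __init__(self, server: Callable[[Range], Dict[int, int]]):
        self.server = server
        self.store: Dict[Range, Dict[int, int]] = {}
        self.server_calls: List[Range] = []  # record of subqueries sent

    def query(self, rng: Range) -> Dict[int, int]:
        start, end = rng
        result: Dict[int, int] = {}
        # Reuse the portion of each cached result that overlaps the query.
        for data in self.store.values():
            result.update({k: v for k, v in data.items() if start <= k < end})
        # Generate one subquery per uncovered gap and send it to the server.
        for gap in uncovered(rng, list(self.store)):
            self.server_calls.append(gap)
            data = self.server(gap)
            self.store[gap] = data  # cache the new result for later queries
            result.update(data)
        return result


def toy_server(rng: Range) -> Dict[int, int]:
    """Stand-in for an application server that holds the raw dataset."""
    return {i: i * i for i in range(*rng)}
```

In this sketch, a query for `(0, 10)` followed by one for `(5, 15)` sends only the subquery `(10, 15)` to the server; the overlap is served from the cache. A real service would also need a replacement policy and a richer notion of query overlap than integer ranges.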