ABSTRACT
Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. High-level language and compiler support for developing applications that analyze and process such datasets has, however, been lacking so far.

In this paper, we present a set of language extensions and a prototype compiler for supporting high-level object-oriented programming of data-intensive reduction operations over multidimensional data. We have chosen a dialect of Java with data-parallel extensions for specifying collections of objects, a parallel for loop, and reduction variables as our source high-level language. Our compiler analyzes parallel loops and optimizes the processing of datasets through the use of an existing run-time system, called Active Data Repository (ADR). We show how loop fission followed by interprocedural static program slicing can be used by the compiler to extract required information for the run-time system. We present the design of a compiler/run-time interface which allows the compiler to effectively utilize the existing run-time system.

A prototype compiler incorporating these techniques has been developed using the Titanium front-end from Berkeley. We have evaluated this compiler by comparing the performance of compiler-generated code with hand-customized ADR code for three templates, from the areas of digital microscopy and scientific simulations. Our experimental results show that the performance of the compiler-generated versions is, on average, 21% lower than that of the hand-coded versions, and in all cases within a factor of two.
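As a rough illustration of the loop fission idea mentioned above, the following sketch uses plain Java rather than the data-parallel dialect described in the abstract. It splits a reduction loop into one loop that only computes which output element each input element maps to, and a second loop that only performs the accumulation. The class `ReductionFissionSketch`, the `Pixel` type, the `outputCell` mapping, and the downsampling example are all hypothetical names and choices made for this illustration. The sketch is not the compiler's actual transformation or the ADR interface.

```java
import java.util.Arrays;

// Hand-written illustration of loop fission on a reduction loop.
// All names here are assumptions made for this sketch.
class ReductionFissionSketch {

    // Hypothetical input element: a 2-D point carrying one value.
    static final class Pixel {
        final int x;
        final int y;
        final double value;

        Pixel(int x, int y, double value) {
            this.x = x;
            this.y = y;
            this.value = value;
        }
    }

    // Hypothetical mapping from an input point to an output cell
    // (downsampling by an integer factor).
    static int outputCell(Pixel p, int factor, int outWidth) {
        return (p.y / factor) * outWidth + (p.x / factor);
    }

    // Fused form: index computation and accumulation share one loop body.
    static double[] fused(Pixel[] input, int factor, int outWidth, int outHeight) {
        double[] out = new double[outWidth * outHeight];
        for (Pixel p : input) {
            out[outputCell(p, factor, outWidth)] += p.value;
        }
        return out;
    }

    // Fissioned form: the first loop only records which output cell each
    // input element touches; the second loop only accumulates.
    static double[] fissioned(Pixel[] input, int factor, int outWidth, int outHeight) {
        int[] target = new int[input.length];
        for (int i = 0; i < input.length; i++) {
            target[i] = outputCell(input[i], factor, outWidth);
        }
        double[] out = new double[outWidth * outHeight];
        for (int i = 0; i < input.length; i++) {
            out[target[i]] += input[i].value;
        }
        return out;
    }

    // Builds a 4x4 grid whose values are 0..15 in row-major order.
    static Pixel[] sampleGrid() {
        Pixel[] grid = new Pixel[16];
        for (int y = 0; y < 4; y++) {
            for (int x = 0; x < 4; x++) {
                grid[y * 4 + x] = new Pixel(x, y, y * 4 + x);
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        Pixel[] input = sampleGrid();
        double[] a = fused(input, 2, 2, 2);
        double[] b = fissioned(input, 2, 2, 2);
        System.out.println(Arrays.toString(a)); // prints [10.0, 18.0, 42.0, 50.0]
        System.out.println(Arrays.equals(a, b)); // prints true
    }
}
```

In the fissioned form, the first loop contains only the index computation, so the input-to-output mapping can be examined separately from the accumulation. The abstract describes the compiler extracting information for the run-time system in a comparable way, using program slicing after fission.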