Executing irregular scientific applications on stream architectures
Source
International Conference on Supercomputing
Proceedings of the 21st annual international conference on Supercomputing
Seattle, Washington
SESSION: Algorithms and applications II
Pages: 93-104
Year of Publication: 2007
ISBN: 978-1-59593-768-1
Authors
Mattan Erez  The University of Texas at Austin
Jung Ho Ahn  Hewlett-Packard Laboratories
Jayanth Gummaraju  Stanford University
Mendel Rosenblum  Stanford University
William J. Dally  Stanford University
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 84,   Citation Count: 3
DOI: http://doi.acm.org/10.1145/1274971.1274987

ABSTRACT

The recent emergence of compute-intensive stream processors such as the Cell Broadband Engine, Stanford's Merrimac, and ClearSpeed's CSX600 has made them attractive platforms for scientific high-performance computing. Unstructured mesh and graph applications are an important class of numerical algorithms used in the scientific computing domain, and they are particularly challenging for stream architectures. These codes have irregular structures in which nodes have a variable number of neighbors, resulting in irregular memory access patterns and irregular control. We study four representative sub-classes of irregular algorithms, including finite-element and finite-volume methods for modeling physical systems, direct methods for n-body problems, and computations involving sparse algebra. We propose a framework for representing the diverse characteristics of these algorithms in the context of the unique properties of stream architectures, and demonstrate it using one representative application from each sub-class. We then develop techniques for mapping the applications onto a stream processor, placing emphasis on data localization and parallelization. Our simulations show that efficient stream hardware with restricted control abilities can effectively run challenging irregular applications: for example, a finite-element method and a molecular dynamics code sustain 69 GFLOP/s and 46 GFLOP/s (64-bit), respectively, on a single chip that measures 12 mm on a side and consumes less than 70 W in 90 nm technology.
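
To make the two kinds of irregularity named in the abstract concrete, here is a minimal Python sketch of a sparse matrix-vector product in compressed sparse row (CSR) form. It is an illustration only, not the mapping technique developed in the paper; the function name `spmv_csr` and the small example matrix are assumptions introduced here.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (illustrative helper)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Irregular control: the inner trip count differs from row to row,
        # just as mesh or graph nodes have a variable number of neighbors.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Irregular memory access: x is gathered indirectly through col_idx.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x3 matrix whose rows hold 1, 3, and 2 nonzeros:
# [[2, 0, 0],
#  [1, 3, 1],
#  [0, 4, 5]]
row_ptr = [0, 1, 4, 6]
col_idx = [0, 0, 1, 2, 1, 2]
values = [2.0, 1.0, 3.0, 1.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)
```

The data-dependent inner loop bound and the indexed read of `x` are the patterns that hardware with restricted control abilities has to accommodate.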



