Architectures and APIs: assessing requirements for delivering FPGA performance to applications
Source: Conference on High Performance Networking and Computing
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Tampa, Florida
SESSION: Technical papers
Article No. 111  
Year of Publication: 2006
ISBN: 0-7695-2700-0
Authors
Keith D. Underwood  Sandia National Laboratories, Albuquerque, NM
K. Scott Hemmert  Sandia National Laboratories, Albuquerque, NM
Craig Ulmer  Sandia National Laboratories, Albuquerque, NM
Sponsors
IEEE: Institute of Electrical and Electronics Engineers
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1188455.1188571

ABSTRACT

Reconfigurable computing leveraging field-programmable gate arrays (FPGAs) is one of many accelerator technologies that are being investigated for application to high-performance computing (HPC). Like most accelerators, FPGAs are very efficient at both dense matrix multiplication and FFT computations, but two important aspects of how to deliver that performance to applications have received too little attention. First, the standard API for important compute kernels hides parallelism from the system. Second, the issue of system architecture is virtually never addressed. This paper explores both issues and their implications for applications. We find that high-bandwidth, low-latency connectivity can be important, but the right API can be even more important.



