ABSTRACT
Diastolic arrays are arrays of processing elements that communicate exclusively through First-In First-Out (FIFO) queues. FIFO virtualization units enable relaxed timing of data transfers and include hardware support to guarantee bandwidth and buffer space for all data transfers, which may follow composite paths through the network. We show that the architecture of diastolic arrays enables efficient synthesis from high-level specifications of communicating finite state machines such that average throughput is maximized. Preliminary results are presented on an H.264 decoding benchmark.