Tartan: evaluating spatial computation for whole program execution
Source: Architectural Support for Programming Languages and Operating Systems
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
San Jose, California, USA
SESSION: Scheduling and spatial programming
Pages: 163 - 174  
Year of Publication: 2006
ISBN:1-59593-451-0
Authors
Mahim Mishra  Carnegie Mellon University, Pittsburgh, PA
Timothy J. Callahan  Carnegie Mellon University, Pittsburgh, PA
Tiberiu Chelcea  Carnegie Mellon University, Pittsburgh, PA
Girish Venkataramani  Carnegie Mellon University, Pittsburgh, PA
Seth C. Goldstein  Carnegie Mellon University, Pittsburgh, PA
Mihai Budiu  Microsoft Research
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
SIGPLAN: ACM Special Interest Group on Programming Languages
SIGOPS: ACM Special Interest Group on Operating Systems
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 97,   Citation Count: 2
DOI: http://doi.acm.org/10.1145/1168857.1168878

ABSTRACT

Spatial Computing (SC) has been shown to be an energy-efficient model for implementing program kernels. In this paper we explore the feasibility of using SC for more than small kernels. To this end, we evaluate the performance and energy efficiency of entire applications on Tartan, a general-purpose architecture which integrates a reconfigurable fabric (RF) with a superscalar core. Our compiler automatically partitions and compiles an application into an instruction stream for the core and a configuration for the RF. We use a detailed simulator to capture both timing and energy numbers for all parts of the system.

Our results indicate that a hierarchical RF architecture, designed around a scalable interconnect, is instrumental in harnessing the benefits of spatial computation. The interconnect uses static configuration and routing at the lower levels and a packet-switched, dynamically-routed network at the top level. Tartan is most energy-efficient when almost all of the application is mapped to the RF, indicating the need for the RF to support most general-purpose programming constructs. Our initial investigation reveals that such a system can provide, on average, an order of magnitude improvement in energy-delay compared to an aggressive superscalar core on single-threaded workloads.
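
The energy-delay figure quoted in the abstract is commonly computed as the product of the energy consumed and the execution time, with lower values being better. The sketch below illustrates that arithmetic only. The function names and all numeric inputs are hypothetical placeholders chosen for the example, not measurements from the paper, and the paper may define or aggregate its metric differently.

```python
def energy_delay_product(energy_joules, delay_seconds):
    """Energy-delay product (EDP): energy times execution time. Lower is better."""
    return energy_joules * delay_seconds


def edp_improvement(baseline, candidate):
    """Ratio of baseline EDP to candidate EDP; values above 1 favor the candidate.

    Each argument is an (energy_joules, delay_seconds) pair.
    """
    return energy_delay_product(*baseline) / energy_delay_product(*candidate)


# Hypothetical, illustrative numbers only (not results from the paper):
# a baseline core versus a configuration that uses far less energy
# and runs slightly faster.
baseline = (2.0, 1.0)    # 2.0 J, 1.0 s
candidate = (0.25, 0.8)  # 0.25 J, 0.8 s

ratio = edp_improvement(baseline, candidate)
print(f"EDP improvement: {ratio:.1f}x")
```

Because the metric multiplies the two quantities, a large energy saving can yield an order-of-magnitude improvement even when the execution time changes only modestly.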

