Modeling instruction placement on a spatial architecture
Source: ACM Symposium on Parallel Algorithms and Architectures
Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures
Cambridge, Massachusetts, USA
SESSION: Processing and scheduling
Pages: 158-169
Year of Publication: 2006
ISBN: 1-59593-452-9
Authors
Martha Mercaldi, University of Washington, Seattle, WA
Steven Swanson, University of Washington, Seattle, WA
Andrew Petersen, University of Washington, Seattle, WA
Andrew Putnam, University of Washington, Seattle, WA
Andrew Schwerin, University of Washington, Seattle, WA
Mark Oskin, University of Washington, Seattle, WA
Susan J. Eggers, University of Washington, Seattle, WA
Sponsors
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1148109.1148137

ABSTRACT

In response to current technology scaling trends, architects are developing a new style of processor known as the spatial computer. A spatial computer is composed of hundreds or even thousands of simple, replicated processing elements (PEs), frequently organized into a grid. Several current spatial computers, such as TRIPS, RAW, SmartMemories, nanoFabrics, and WaveScalar, explicitly place a program's instructions onto the grid. Designing instruction placement algorithms is an enormous challenge: the number of possible mappings of instructions to PEs is exponential in the size of the application, and the choice of mapping greatly affects program performance. In this paper we develop an instruction placement performance model that can inform instruction placement. The model comprises three components, each of which captures a different aspect of spatial computing performance: inter-instruction operand latency, data cache coherence overhead, and contention for processing element resources. We evaluate the model on one spatial computer, WaveScalar, and find that predicted and actual performance correlate with a coefficient of -0.90. We demonstrate the model's utility by using it to design a new placement algorithm, which outperforms our previous algorithms. Although developed in the context of WaveScalar, the model can serve as a foundation for tuning code, compiling software, and understanding the microarchitectural trade-offs of spatial computers in general.
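
To make the three-component structure concrete, the following is a minimal illustrative sketch in Python. It is not the paper's model: the abstract does not give the formulas, so every proxy here is an assumption. Operand latency is approximated by Manhattan distance between producer and consumer PEs, coherence overhead by the number of extra PEs touching the same memory region, and contention by the number of instructions exceeding a per-PE capacity. All names (`placement_cost`, `mem_region`, `capacity`, `weights`) are hypothetical.

```python
from collections import Counter, defaultdict


def operand_latency(edges, placement):
    """Assumed proxy: sum of Manhattan distances over dataflow edges."""
    total = 0
    for src, dst in edges:
        (r1, c1), (r2, c2) = placement[src], placement[dst]
        total += abs(r1 - r2) + abs(c1 - c2)
    return total


def coherence_overhead(mem_region, placement):
    """Assumed proxy: extra PEs sharing each memory region."""
    sharers = defaultdict(set)
    for instr, region in mem_region.items():
        sharers[region].add(placement[instr])
    return sum(len(pes) - 1 for pes in sharers.values())


def pe_contention(placement, capacity):
    """Assumed proxy: instructions beyond each PE's capacity."""
    load = Counter(placement.values())
    return sum(max(0, n - capacity) for n in load.values())


def placement_cost(edges, mem_region, placement, capacity,
                   weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three components; lower is assumed better."""
    w_lat, w_coh, w_con = weights
    return (w_lat * operand_latency(edges, placement)
            + w_coh * coherence_overhead(mem_region, placement)
            + w_con * pe_contention(placement, capacity))


# Hypothetical three-instruction dataflow graph on a 2x2 grid.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
mem_region = {"a": "X", "c": "X"}  # a and c access the same region

packed = {"a": (0, 0), "b": (0, 0), "c": (0, 0)}  # everything on one PE
spread = {"a": (0, 0), "b": (0, 1), "c": (1, 1)}  # one instruction per PE

print(placement_cost(edges, mem_region, packed, capacity=2))  # 1.0
print(placement_cost(edges, mem_region, spread, capacity=2))  # 5.0
```

The unit weights are arbitrary. In any real use they would need to be fit against measured performance on the target machine, and the proxies themselves would need to reflect that machine's network, cache, and PE design.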


