| Maps: a compiler-managed memory system for raw machines |
| Full text |
Pdf
(232 KB)
|
| Source
|
International Symposium on Computer Architecture
archive
Proceedings of the 26th annual international symposium on Computer architecture
table of contents
Atlanta, Georgia, United States
Pages: 4 - 15
Year of Publication: 1999
ISBN:0-7695-0170-2
Also published in ...
|
|
Authors
|
|
Rajeev Barua
|
M.I.T. Laboratory for Computer Science, Cambridge, MA
|
|
Walter Lee
|
M.I.T. Laboratory for Computer Science, Cambridge, MA
|
|
Saman Amarasinghe
|
M.I.T. Laboratory for Computer Science, Cambridge, MA
|
|
Anant Agarwal
|
M.I.T. Laboratory for Computer Science, Cambridge, MA
|
|
| Sponsors |
|
| Publisher |
IEEE Computer Society
Washington, DC, USA
|
| Bibliometrics |
Downloads (6 Weeks): 8, Downloads (12 Months): 24, Citation Count: 19
|
|
|
ABSTRACT
This paper describes Maps, a compiler managed memory system for Raw architectures. Traditional processors for sequential programs maintain the abstraction of a unified memory by using a single centralized memory system. This implementation leads to the infamous "Von Neumann bottleneck," with machine performance limited by the large memory latency and limited memory bandwidth. A Raw architecture addresses this problem by taking advantage of the rapidly increasing transistor budget to move much of its memory on chip. To remove the bottleneck and complexity associated with centralized memory, Raw distributes the memory with its processing elements. Unified memory semantics are implemented jointly by the hardware and the compiler. The hardware provides a clean compiler interface to its two inter-tile interconnects: a fast, statically schedulable network and a traditional dynamic network. Maps then uses these communication mechanisms to orchestrate the memory accesses for low latency and parallelism while enforcing proper dependence. It optimizes for speed in two ways: by finding accesses that can be scheduled on the static interconnect through static promotion, and by minimizing dependence sequentialization for the remaining accesses. Static promotion is performed using equivalence class unification and modulo unrolling; memory dependences are enforced through explicit synchronization and software serial ordering. We have implemented Maps based on the SUIF infrastructure. This paper demonstrates that the exclusive use of static promotion yields roughly 20-fold speedup on 32 tiles for our regular applications and about 5-fold speedup on 16 or more tiles for our irregular applications. The paper also shows that selective use of dynamic accesses can be a useful complement to the mostly static memory system.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
Doug Burger , James R. Goodman , Alain Kägi, Memory bandwidth limitations of future microprocessors, Proceedings of the 23rd annual international symposium on Computer architecture, p.78-89, May 22-24, 1996, Philadelphia, Pennsylvania, United States
|
| |
3
|
|
 |
4
|
Sandhya Dwarkadas , Alan L. Cox , Willy Zwaenepoel, An integrated compile-time/run-time software distributed shared memory system, Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, p.186-197, October 01-04, 1996, Cambridge, Massachusetts, United States
|
 |
5
|
|
| |
6
|
Mary W. Hall , Jennifer M. Anderson , Saman P. Amarasinghe , Brian R. Murphy , Shih-Wei Liao , Edouard Bugnion , Monica S. Lam, Maximizing Multiprocessor Performance with the SUIF Compiler, Computer, v.29 n.12, p.84-89, December 1996
[doi> 10.1109/2.546613]
|
 |
7
|
Walter Lee , Rajeev Barua , Matthew Frank , Devabhaktuni Srikrishna , Jonathan Babb , Vivek Sarkar , Saman Amarasinghe, Space-time scheduling of instruction-level parallelism on a raw machine, Proceedings of the eighth international conference on Architectural support for programming languages and operating systems, p.46-57, October 02-07, 1998, San Jose, California, United States
|
| |
8
|
P. Geoffrey Lowney , Stefan M. Freudenberger , Thomas J. Karzes , W. D. Lichtenstein , Robert P. Nix , John S. O'Donnell , John Ruttenberg, The multiflow trace scheduling compiler, The Journal of Supercomputing, v.7 n.1-2, p.51-142, May 1993
[doi> 10.1007/BF01205182]
|
| |
9
|
|
 |
10
|
Dror E. Maydan , John L. Hennessy , Monica S. Lam, Efficient and exact data dependence analysis, Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, p.1-14, June 24-28, 1991, Toronto, Ontario, Canada
|
 |
11
|
Shubhendu S. Mukherjee , Shamik D. Sharma , Mark D. Hill , James R. Larus , Anne Rogers , Joel Saltz, Efficient support for irregular applications on distributed-memory machines, Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming, p.68-79, July 19-21, 1995, Santa Barbara, California, United States
|
 |
12
|
|
 |
13
|
Daniel J. Scales , Kourosh Gharachorloo , Chandramohan A. Thekkath, Shasta: a low overhead, software-only approach for supporting fine-grain shared memory, Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, p.174-185, October 01-04, 1996, Cambridge, Massachusetts, United States
|
| |
14
|
M. D. Smith. Extending SUIF for Machine-dependent Optimizations. In Proceedings of the First SUIF Compiler Workshop, pages 14--25, Stanford, CA, Jan. I996.
|
| |
15
|
Elliot Waingold , Michael Taylor , Devabhaktuni Srikrishna , Vivek Sarkar , Walter Lee , Victor Lee , Jang Kim , Matthew Frank , Peter Finch , Rajeev Barua , Jonathan Babb , Saman Amarasinghe , Anant Agarwal, Baring It All to Software: Raw Machines, Computer, v.30 n.9, p.86-93, September 1997
[doi> 10.1109/2.612254]
|
 |
16
|
Robert P. Wilson , Robert S. French , Christopher S. Wilson , Saman P. Amarasinghe , Jennifer M. Anderson , Steve W. K. Tjiang , Shih-Wei Liao , Chau-Wen Tseng , Mary W. Hall , Monica S. Lam , John L. Hennessy, SUIF: an infrastructure for research on parallelizing and optimizing compilers, ACM SIGPLAN Notices, v.29 n.12, p.31-37, Dec. 1994
[doi> 10.1145/193209.193217]
|
CITED BY 19
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Michael Bedford Taylor , Walter Lee , Jason Miller , David Wentzlaff , Ian Bratt , Ben Greenwald , Henry Hoffmann , Paul Johnson , Jason Kim , James Psota , Arvind Saraf , Nathan Shnidman , Volker Strumpen , Matt Frank , Saman Amarasinghe , Anant Agarwal, Evaluation of the Raw Microprocessor: An Exposed-Wire-Delay Architecture for ILP and Streams, ACM SIGARCH Computer Architecture News, v.32 n.2, p.2, March 2004
|
|
|
|
|
|
|
|