Memory access scheduling
Source
International Symposium on Computer Architecture
Proceedings of the 27th annual international symposium on Computer architecture
Vancouver, British Columbia, Canada
Pages: 128 - 138
Year of Publication: 2000
ISBN: 1-58113-232-8
Authors
Scott Rixner, Electrical Engineering, Massachusetts Institute of Technology and Computer Systems Laboratory, Stanford University, Stanford, CA
William J. Dally, Computer Systems Laboratory, Stanford University, Stanford, CA
Ujval J. Kapasi, Computer Systems Laboratory, Stanford University, Stanford, CA
Peter Mattson, Computer Systems Laboratory, Stanford University, Stanford, CA
John D. Owens, Computer Systems Laboratory, Stanford University, Stanford, CA
Bibliometrics
Downloads (6 Weeks): 35, Downloads (12 Months): 116, Citation Count: 53
ABSTRACT
The bandwidth and latency of a memory system are strongly dependent on the manner in which accesses interact with the “3-D” structure of banks, rows, and columns characteristic of contemporary DRAM chips. There is nearly an order of magnitude difference in bandwidth between successive references to different columns within a row and different rows within a bank. This paper introduces memory access scheduling, a technique that improves the performance of a memory system by reordering memory references to exploit locality within the 3-D memory structure. Conservative reordering, in which the first ready reference in a sequence is performed, improves bandwidth by 40% for traces from five media benchmarks. Aggressive reordering, in which operations are scheduled to optimize memory bandwidth, improves bandwidth by 93% for the same set of applications. Memory access scheduling is particularly important for media processors where it enables the processor to make the most efficient use of scarce memory bandwidth.
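The idea of reordering references to exploit row locality can be illustrated with a toy model. The sketch below is not the paper's simulator or either of its policies as specified: the `Ref` type, the cycle costs (`ROW_HIT_CYCLES`, `ROW_MISS_CYCLES`), the `window` parameter, and the "oldest row hit first" rule are all assumptions chosen for illustration. It also treats every reference as an independent read, so it ignores ordering constraints between reads and writes to the same address.

```python
from collections import namedtuple

# Hypothetical reference type: one access to (bank, row, column).
Ref = namedtuple("Ref", ["bank", "row", "col"])

# Assumed costs, for illustration only (not taken from any data sheet).
ROW_HIT_CYCLES = 1   # column access to a row that is already open
ROW_MISS_CYCLES = 6  # precharge + activate + column access


def access_cost(open_rows, ref):
    """Return the cost of one access and update the open row of its bank."""
    if open_rows.get(ref.bank) == ref.row:
        return ROW_HIT_CYCLES
    open_rows[ref.bank] = ref.row
    return ROW_MISS_CYCLES


def in_order_cycles(refs):
    """Serve references strictly in arrival order."""
    open_rows = {}
    return sum(access_cost(open_rows, r) for r in refs)


def row_hit_first_cycles(refs, window=4):
    """Serve the oldest reference in the window that hits an open row,
    falling back to the oldest reference when none hits."""
    open_rows = {}
    pending = list(refs)
    total = 0
    while pending:
        candidates = pending[:window]
        pick = next(
            (i for i, r in enumerate(candidates)
             if open_rows.get(r.bank) == r.row),
            0,
        )
        total += access_cost(open_rows, pending.pop(pick))
    return total


# Two rows of one bank accessed alternately: the worst case for in-order service.
trace = [Ref(0, 0, 0), Ref(0, 1, 0), Ref(0, 0, 1), Ref(0, 1, 1)]
print(in_order_cycles(trace))       # 24
print(row_hit_first_cycles(trace))  # 14
```

Under these assumed costs, the in-order schedule pays a row miss on every access (24 cycles), while the reordered schedule groups accesses to the same row and pays only two misses (14 cycles). The absolute numbers depend entirely on the assumed costs and say nothing about the percentages reported in the abstract.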
CITED BY 53
John D. Owens, William J. Dally, Ujval J. Kapasi, Scott Rixner, Peter Mattson, Ben Mowery, Polygon rendering on a stream architecture, Proceedings of the ACM SIGGRAPH/EUROGRAPHICS workshop on Graphics hardware, p.23-32, August 21-22, 2000, Interlaken, Switzerland
Brucek Khailany, William J. Dally, Ujval J. Kapasi, Peter Mattson, Jinyung Namkoong, John D. Owens, Brian Towles, Andrew Chang, Scott Rixner, Imagine: Media Processing with Streams, IEEE Micro, v.21 n.2, p.35-46, March 2001
Ravi Iyer, Li Zhao, Fei Guo, Ramesh Illikkal, Srihari Makineni, Don Newell, Yan Solihin, Lisa Hsu, Steve Reinhardt, QoS policies and architecture for cache/memory in CMP platforms, ACM SIGMETRICS Performance Evaluation Review, v.35 n.1, June 2007
Hongzhong Zheng, Jiang Lin, Zhao Zhang, Eugene Gorbatov, Howard David, Zhichun Zhu, Mini-rank: Adaptive DRAM architecture for improving memory power efficiency, Proceedings of the 2008 41st IEEE/ACM International Symposium on Microarchitecture, p.210-221, November 08-12, 2008
Yossi Azar, Uriel Feige, Iftah Gamzu, Thomas Moscibroda, Prasad Raghavendra, Buffer management for colored packets with deadlines, Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, August 11-13, 2009, Calgary, AB, Canada