Concurrency, latency, or system overhead: which has the largest impact on uniprocessor DRAM-system performance?
Source: International Symposium on Computer Architecture
Proceedings of the 28th Annual International Symposium on Computer Architecture
Göteborg, Sweden
Pages: 62 - 71  
Year of Publication: 2001
ISBN:0-7695-1162-7
Authors
Vinodh Cuppu  Dept. of Electrical & Computer Engineering, University of Maryland, College Park
Bruce Jacob  Dept. of Electrical & Computer Engineering, University of Maryland, College Park
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS\TCCA: TC on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/379240.379252

ABSTRACT

Given a fixed CPU architecture and a fixed DRAM timing specification, there is still a large design space for a DRAM system organization. Parameters include the number of memory channels, the bandwidth of each channel, burst sizes, queue sizes and organizations, turnaround overhead, memory-controller page protocol, and the algorithms for assigning request priorities and scheduling requests dynamically. In this design space, we see a wide variation in application execution times: for example, execution times for the SPEC CPU 2000 integer suite on a 2-way ganged Direct Rambus organization (32 data bits) with 64-byte bursts are 10-20% lower than execution times on an otherwise identical configuration that uses 32-byte bursts. These two system configurations are relatively close to each other in the design space; performance differences become even more pronounced for designs further apart.
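
To make the burst-size comparison concrete, the following minimal sketch computes how long a single burst occupies the data bus in the two configurations named above. The 32-bit data width and the 32- and 64-byte burst sizes come from the abstract; the 800 MT/s transfer rate is an assumption (it is not stated here), and the names `ChannelConfig`, `transfers_per_burst`, and `burst_time_ns` are hypothetical. The sketch models bus occupancy only, not the execution-time differences the paper reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelConfig:
    """One illustrative point in the DRAM-system design space."""
    data_bits: int         # width of the (ganged) data channel, in bits
    burst_bytes: int       # bytes moved per burst
    transfers_per_us: int  # ASSUMPTION: data transfers per microsecond (800 = 800 MT/s)

    def transfers_per_burst(self) -> int:
        # Each transfer moves data_bits / 8 bytes across the channel.
        bytes_per_transfer = self.data_bits // 8
        return self.burst_bytes // bytes_per_transfer

    def burst_time_ns(self) -> float:
        # Convert transfers at a per-microsecond rate into nanoseconds.
        return self.transfers_per_burst() * 1000 / self.transfers_per_us


# 2-way ganged organization with 32 data bits, as described in the abstract.
burst32 = ChannelConfig(data_bits=32, burst_bytes=32, transfers_per_us=800)
burst64 = ChannelConfig(data_bits=32, burst_bytes=64, transfers_per_us=800)

for cfg in (burst32, burst64):
    print(cfg.burst_bytes, cfg.transfers_per_burst(), cfg.burst_time_ns())
```

Under the assumed transfer rate, the 64-byte burst holds the data bus twice as long as the 32-byte burst, yet the abstract reports lower execution times for it; bus occupancy alone does not explain that result.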

This paper characterizes the sources of overhead in high-performance DRAM systems and investigates the most effective ways to reduce a system's exposure to performance loss. In particular, we look at mechanisms to increase a system's support for concurrent transactions, mechanisms to reduce request latency, and mechanisms to reduce the “system overhead”: the portion of the primary memory system's overhead that is not due to DRAM latency but rather to factors such as turnaround time, request queueing, and inefficiencies due to read/write request interleaving. Our simulator models a 2 GHz, highly aggressive out-of-order uniprocessor. The interface to the memory system is fully non-blocking, supporting up to 32 outstanding misses at both the level-1 and level-2 caches and split-transaction buses to all DRAM banks.
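
The relationship between concurrency and latency can be illustrated with Little's law: sustained bandwidth cannot exceed the bytes in flight divided by the average request latency. The sketch below is a back-of-the-envelope bound, not the paper's simulator. The limit of 32 outstanding misses comes from the abstract; the 64-byte request size and the latency values are assumptions chosen purely for illustration, and `max_bandwidth_gb_per_s` is a hypothetical helper.

```python
def max_bandwidth_gb_per_s(outstanding: int, bytes_per_request: int,
                           latency_ns: float) -> float:
    """Little's law bound: bytes in flight / latency.

    One byte per nanosecond equals 10**9 bytes per second, so the
    result can be read directly as GB/s.
    """
    if latency_ns <= 0:
        raise ValueError("latency_ns must be positive")
    return outstanding * bytes_per_request / latency_ns


# ASSUMPTION: 64-byte requests and these average latencies are illustrative.
for latency_ns in (50.0, 100.0, 200.0):
    bound = max_bandwidth_gb_per_s(32, 64, latency_ns)
    print(f"latency {latency_ns:.0f} ns -> at most {bound:.2f} GB/s")
```

In this simple model, halving the latency and doubling the number of outstanding requests raise the bound by the same factor. It leaves out the queueing and turnaround effects that the abstract groups under system overhead.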

