High-bandwidth address translation for multiple-issue processors
Source: International Symposium on Computer Architecture
Proceedings of the 23rd Annual International Symposium on Computer Architecture
Philadelphia, Pennsylvania, United States
Pages: 158 - 167  
Year of Publication: 1996
ISBN:0-89791-786-3
Authors
Todd M. Austin  Computer Sciences Department, University of Wisconsin-Madison, 1210 W. Dayton Street, Madison, WI
Gurindar S. Sohi  Computer Sciences Department, University of Wisconsin-Madison, 1210 W. Dayton Street, Madison, WI
Sponsors
IEEE-CS\TCCA: TC on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/232973.232990

ABSTRACT

In an effort to push the envelope of system performance, microprocessor designs are continually exploiting higher levels of instruction-level parallelism, resulting in increasing bandwidth demands on the address translation mechanism. Most current microprocessor designs meet this demand with a multi-ported TLB. While this design provides an excellent hit rate at each port, its access latency and area grow very quickly as the number of ports is increased. As bandwidth demands continue to increase, multi-ported designs will soon impact memory access latency.

We present four high-bandwidth address translation mechanisms with latency and area characteristics that scale better than a multi-ported TLB design. We extend traditional high-bandwidth memory design techniques to address translation, developing interleaved and multi-level TLB designs. In addition, we introduce two new designs crafted specifically for high-bandwidth address translation. Piggyback ports are introduced as a technique to exploit spatial locality in simultaneous translation requests, allowing accesses to the same virtual memory page to combine their requests at the TLB access port. Pretranslation is introduced as a technique for attaching translations to base register values, making it possible to reuse a single translation many times.

We perform extensive simulation-based studies to evaluate our designs. We vary key system parameters, such as processor model, page size, and number of architected registers, to see what effects these changes have on the relative merits of each approach. A number of designs show particular promise. Multi-level TLBs with as few as eight entries in the upper-level TLB nearly achieve the performance of a TLB with unlimited bandwidth. Piggyback ports combined with a lesser-ported TLB structure, e.g., an interleaved or multi-ported TLB, also perform well. Pretranslation over a single-ported TLB performs almost as well as a same-sized multi-level TLB with the added benefit of decreased access latency for physically indexed caches.
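
The piggyback-port idea described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the 4 KB page size, the dictionary-based TLB, and the function names piggyback_groups and translate_cycle are all hypothetical and chosen for illustration. Requests issued in the same cycle are grouped by virtual page number, and each group shares one TLB port access; TLB misses are simply deferred.

```python
PAGE_SHIFT = 12  # assumption: 4 KB pages


def piggyback_groups(vaddrs, page_shift=PAGE_SHIFT):
    """Group simultaneous requests by virtual page number, in arrival order."""
    groups = {}
    for va in vaddrs:
        groups.setdefault(va >> page_shift, []).append(va)
    return groups


def translate_cycle(vaddrs, tlb, ports=1, page_shift=PAGE_SHIFT):
    """Model one cycle of translation with piggybacking (hypothetical model).

    `tlb` maps virtual page numbers to physical frame numbers. Each port
    serves one distinct page; every request to that page rides along on
    the same access. Returns (translated, deferred), where `translated`
    maps each served virtual address to its physical address and
    `deferred` lists requests that must retry (no free port, or a miss).
    """
    offset_mask = (1 << page_shift) - 1
    translated, deferred = {}, []
    for port, (vpn, reqs) in enumerate(piggyback_groups(vaddrs, page_shift).items()):
        if port < ports and vpn in tlb:
            for va in reqs:
                translated[va] = (tlb[vpn] << page_shift) | (va & offset_mask)
        else:
            deferred.extend(reqs)
    return translated, deferred
```

In this model, four simultaneous requests of which three fall on the same page need only one port access for those three; the fourth waits for a later cycle or a second port.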


