Substituting associative load queue with simple hash tables in out-of-order microprocessors
Source: International Symposium on Low Power Electronics and Design
Proceedings of the 2006 international symposium on Low power electronics and design
Tegernsee, Bavaria, Germany
SESSION: Memory hierarchy and caches
Pages: 268 - 273
Year of Publication: 2006
ISBN: 1-59593-462-6
Authors
Alok Garg  University of Rochester
Fernando Castro  Universidad Complutense Madrid
Michael Huang  University of Rochester
Daniel Chaver  Universidad Complutense Madrid
Luis Piñuel  Universidad Complutense Madrid
Manuel Prieto  Universidad Complutense Madrid
Sponsors
ACM: Association for Computing Machinery
SIGDA: ACM Special Interest Group on Design Automation
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1165573.1165637

ABSTRACT

Buffering more in-flight instructions in an out-of-order microprocessor is a straightforward and effective method to help tolerate the long latencies generally associated with off-chip memory accesses. One of the main challenges of buffering a large number of instructions, however, is implementing a scalable and efficient mechanism to detect the memory access order violations that result from out-of-order scheduling of load and store instructions. Traditional CAM-based associative queues can be very slow and energy-consuming. In this paper, instead of using the traditional age-based load queue to record load addresses, we explicitly record age information in address-indexed hash tables to achieve the same functionality: detecting premature loads. This alternative design eliminates associative searches and significantly reduces the energy consumption of the load queue. With simple techniques to reduce the number of false positives, performance degradation is kept to a minimum.
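
The abstract does not describe how the tables are organized, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes one table indexed by a hash of the address, where each entry keeps the age of the youngest load that has executed to any address mapping to it. When a store resolves its address, it reads that one entry. A recorded load age younger than the store's age signals a possible premature load. The class name AgeHashTable, the index function, the table size, and the use of unbounded integers as ages (no wraparound, no entry clearing at commit) are all hypothetical.

```python
class AgeHashTable:
    """Illustrative address-indexed table of load ages (hypothetical design)."""

    NO_LOAD = -1  # sentinel: no load recorded in this entry

    def __init__(self, num_entries=64):
        self.num_entries = num_entries
        self.youngest_load_age = [self.NO_LOAD] * num_entries

    def _index(self, address):
        # Assumed hash: drop the 3 low bits (8-byte words), then take modulo.
        return (address >> 3) % self.num_entries

    def record_load(self, address, age):
        """Called when a load executes; a larger age means a younger instruction."""
        i = self._index(address)
        if age > self.youngest_load_age[i]:
            self.youngest_load_age[i] = age

    def store_conflicts(self, address, age):
        """Called when a store resolves its address.

        Returns True if some load younger than the store has already
        executed to an address that maps to the same entry. Because
        different addresses can share an entry, True may be a false positive.
        """
        return self.youngest_load_age[self._index(address)] > age


table = AgeHashTable(num_entries=4)
table.record_load(0x1000, age=10)          # load with age 10 executes early
print(table.store_conflicts(0x1000, 5))    # older store, same address
print(table.store_conflicts(0x1000, 12))   # younger store, same address
print(table.store_conflicts(0x1020, 5))    # older store, different address, same entry
```

The lookup is a single indexed read rather than an associative search over all in-flight loads, which is the property the abstract credits for the energy savings. The last call shows the cost: addresses 0x1000 and 0x1020 share an entry in this four-entry table, so the store is flagged even though no true violation occurred. This is the kind of false positive the abstract says must be reduced.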



