ABSTRACT
This paper explores the concept of micro-architectural loops and discusses their impact on processor pipelines. In particular, we establish the relationship between loose loops and pipeline length and configuration, and show their impact on performance. We then evaluate the load resolution loop in detail and propose the distributed register algorithm (DRA) as a way of reducing this loop. The DRA decreases the performance loss due to load mis-speculations by reducing the issue-to-execute latency in the pipeline. The DRA introduces a new loose loop into the pipeline, but the frequency of mis-speculations on that loop is very low. The reduction in issue-to-execute latency, together with the DRA's low mis-speculation rate, results in performance improvements of up to 4% to 15%, measured with a detailed architectural simulator.
CITED BY 53
Steven Hsu, Shih-Lien Lu, Shih-Chang Lai, Ram Krishnamurthy, Konrad Lai, Dynamic addressing memory arrays with physical locality, Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture, November 18-22, 2002, Istanbul, Turkey
Ashok Jagannathan, Hannah Honghua Yang, Kris Konigsfeld, Dan Milliron, Mosur Mohan, Michail Romesis, Glenn Reinman, Jason Cong, Microarchitecture evaluation with floorplanning and interconnect pipelining, Proceedings of the 2005 conference on Asia South Pacific design automation, January 18-21, 2005, Shanghai, China
R. González, A. Cristal, M. Pericas, M. Valero, A. Veidenbaum, An asymmetric clustered processor based on value content, Proceedings of the 19th annual international conference on Supercomputing, June 20-22, 2005, Cambridge, Massachusetts
Jason Cong, Ashok Jagannathan, Yuchun Ma, Glenn Reinman, Jie Wei, Yan Zhang, An automated design flow for 3D microarchitecture evaluation, Proceedings of the 2006 conference on Asia South Pacific design automation, January 24-27, 2006, Yokohama, Japan
Oguz Ergin, Deniz Balkan, Kanad Ghose, Dmitry Ponomarev, Register Packing: Exploiting Narrow-Width Operands for Reducing Register File Pressure, Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, p.304-315, December 04-08, 2004, Portland, Oregon
Eric Tune, Rakesh Kumar, Dean M. Tullsen, Brad Calder, Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy, Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture, p.183-194, December 04-08, 2004, Portland, Oregon
Deniz Balkan, Joseph Sharkey, Dmitry Ponomarev, Kanad Ghose, Selective writeback: exploiting transient values for energy-efficiency and performance, Proceedings of the 2006 international symposium on Low power electronics and design, October 04-06, 2006, Tegernsee, Bavaria, Germany
Deniz Balkan, Joseph Sharkey, Dmitry Ponomarev, Kanad Ghose, SPARTAN: speculative avoidance of register allocations to transient values for performance and energy efficiency, Proceedings of the 15th international conference on Parallel architectures and compilation techniques, September 16-20, 2006, Seattle, Washington, USA
David Ródenas, Xavier Martorell, Eduard Ayguadé, Jesús Labarta, George Almási, Călin Caşcaval, José Castaños, José Moreira, Exploiting multilevel parallelism using OpenMP on a massive multithreaded architecture, Journal of Embedded Computing, v.2 n.2, p.141-155, April 2006
Srinath Sridharan, Michael DeBole, Guangyu Sun, Yuan Xie, Vijaykrishnan Narayanan, A criticality-driven microarchitectural three dimensional (3D) floorplanner, Proceedings of the 2009 Conference on Asia and South Pacific Design Automation, January 19-22, 2009, Yokohama, Japan