ABSTRACT
The current methodology in mass-market processor design is to create a single base microarchitecture (e.g., Intel's "Core" or AMD's "K8") that is used across all PC market segments, from laptops to servers. To differentiate products, manufacturers rely on speed binning, different cache sizes, and varying numbers of cores. In this paper, we propose using 3D integration as a new, complementary approach to product differentiation. Past research on using 3D to improve performance has focused on constructing "fully 3D" circuits, where functional blocks are partitioned across two or more layers. This approach forces one of two undesirable situations: (1) all products must be implemented in 3D and therefore pay its cost, or (2) a 3D-implemented processor is designed for the high-end, high-performance markets and a separate 2D microarchitecture must be designed for the lower-cost markets, incurring significant additional design effort and engineering cost. We present a modular processor architecture in which 3D can be used to enhance performance within a single unified design, and which also provides a more gradual migration path toward fully 3D-integrated designs. To make this work, we describe a generic technique of using "phantom" components, where the baseline processor may believe that 3D-stacked resources exist but are currently unavailable. Simply using 3D to stack more L2 cache provides a 15.1% average performance benefit, whereas our proposal increases performance by 25.4%.