ABSTRACT
While the number of transistors on a chip increases exponentially over time, the productivity that can be realized from these systems has not kept pace. To deal with the complexity of modern systems, software developers are increasingly dependent on specialized development tools such as security profilers, memory leak identifiers, data flight recorders, and dynamic type analyzers. Many of these tools require full-system data that covers multiple interacting threads, processes, and processors. Reducing the performance penalty and complexity of these software tools is critical to those developing next-generation applications, and many researchers have proposed adding specialized hardware to assist in profiling and introspection. Unfortunately, while this additional hardware would be highly beneficial to developers, its cost must be paid on every single die that is manufactured.

In this paper, we argue that a new way to attack this problem is to add specialized analysis hardware built on separate active layers stacked vertically on the processor die using 3D IC technology. This provides a modular "snap-on" functionality that could be included with developer systems and omitted from consumer systems to keep the cost impact to a minimum. We describe the advantage of using inter-die vias for introspection, and we quantify the impact they can have in terms of the area, power, temperature, and routability of the resulting systems. We show that hardware stubs could be inserted into commodity processors at design time that would allow analysis layers to be bonded to development chips, and that these stubs would increase area and power by no more than 0.021 mm² and 0.9%, respectively.