ABSTRACT
It is widely accepted that the disproportionate scaling of transistor and conventional on-chip interconnect performance presents a major barrier to future high performance systems. Previous research has focused on wire-centric designs that use parallelism, locality, and on-chip wiring bandwidth to compensate for long wire latency. An alternative approach to this problem is to exploit newly-emerging on-chip transmission line technology to reduce communication latency. Compared to conventional RC wires, transmission lines can reduce delay by up to a factor of 30 for global wires, while eliminating the need for repeaters. However, this latency reduction comes at the cost of a comparable reduction in bandwidth.

In this paper, we investigate using transmission lines to access large level-2 on-chip caches. We propose a family of Transmission Line Cache (TLC) designs that represent different points in the latency/bandwidth spectrum. Compared to the recently-proposed Dynamic Non-Uniform Cache Architecture (DNUCA) design, the base TLC design reduces the required cache area by 18% and reduces the interconnection network's dynamic power consumption by an average of 61%. The optimized TLC designs attain similar performance using fewer transmission lines but with some additional complexity. Simulation results using full-system simulation show that TLC provides more consistent performance than the DNUCA design across a wide variety of workloads. TLC caches are logically simpler than DNUCA designs, but require greater circuit and manufacturing complexity.