ABSTRACT
The trend towards more processor cores and larger cache capacity in future Chip-Multiprocessors (CMPs) will require scalable packet-switched interconnection networks adapted to the restrictions imposed by the CMP environment. This paper presents an innovative router design that addresses CMP cost/performance constraints. The router structure is based on two independent rings, which force packets to circulate either clockwise or anti-clockwise, traveling through every port of the router. It uses a completely decentralized scheduling scheme, which allows the design to: (1) take advantage of wide links, (2) reduce Head-of-Line blocking, (3) use adaptive routing, (4) be topology agnostic, (5) scale with network degree, and (6) have reasonable power consumption and implementation cost. A thorough comparative performance analysis against competitive conventional routers shows an advantage for our proposal of up to 50% in raw performance and nearly 60% in energy-delay product.
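The two-ring structure described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's design: the port numbering, the function names `ring_hops` and `pick_ring`, and the "take the shorter ring" policy are all hypothetical, since the abstract does not say how a packet is assigned to a ring.

```python
def ring_hops(src_port: int, dst_port: int, num_ports: int) -> dict:
    """Count the port-to-port hops from src_port to dst_port on each ring.

    Assumption: ports are numbered 0..num_ports-1 in clockwise order.
    """
    if num_ports < 1:
        raise ValueError("num_ports must be positive")
    if not (0 <= src_port < num_ports and 0 <= dst_port < num_ports):
        raise ValueError("port index out of range")
    return {
        # Clockwise ring visits src+1, src+2, ... (mod num_ports).
        "clockwise": (dst_port - src_port) % num_ports,
        # Anti-clockwise ring visits src-1, src-2, ... (mod num_ports).
        "anticlockwise": (src_port - dst_port) % num_ports,
    }


def pick_ring(src_port: int, dst_port: int, num_ports: int) -> str:
    """Hypothetical policy: choose the ring with fewer hops.

    Ties go to "clockwise" because min() returns the first minimal key.
    """
    hops = ring_hops(src_port, dst_port, num_ports)
    return min(hops, key=hops.get)


if __name__ == "__main__":
    # A 5-port router: input port 0, output port 4.
    print(ring_hops(0, 4, 5))
    print(pick_ring(0, 4, 5))
```

In this model a packet entering at port 0 of a 5-port router and leaving at port 4 would need four hops on the clockwise ring but only one on the anti-clockwise ring. A real design would also have to account for buffering and scheduling at each port, which the sketch ignores.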