ABSTRACT
As the number of cores on a chip increases, the power consumed by the communication structures takes up a significant portion of the overall power budget. In this article, we first propose a circuit-switched interconnection architecture that uses crossroad switches to dynamically construct dedicated channels between any pair of cores, targeting application-specific SoCs that are not very large. The crossroad switch has a simple structure and can be regarded as a NoC-lite router, so a low-power on-chip network can be constructed easily from these switches with a system-level design methodology. We also present a design methodology that tailors the proposed interconnection architecture to low-power structures through two optimization schemes driven by profiled communication characteristics. The first scheme is power-aware topology construction, which builds low-power application-specific interconnection topologies. To reduce power consumption further, the second scheme predetermines the operating mode of the dual-mode switches in the NoC at runtime. We evaluate several interconnection techniques, and the results show that the proposed architecture consumes less power and delivers higher performance than the others under certain constraints and scale boundaries. Taking multimedia applications as case studies, experimental results show that power-aware topology construction saves approximately 49% of the power of the interconnection architecture. Applying the partially dedicated path mechanism reduces power consumption by approximately a further 25%.