PicoServer: Using 3D stacking technology to build energy efficient servers
Source
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Volume 4, Issue 4 (October 2008)
Article No. 16
Year of Publication: 2008
ISSN:1550-4832
Authors
Taeho Kgil  University of Michigan, Intel, Ann Arbor, MI
Ali Saidi  University of Michigan, Ann Arbor, MI
Nathan Binkert  HP Labs, Palo Alto, CA
Steve Reinhardt  University of Michigan, AMD, Ann Arbor, MI
Krisztian Flautner  ARM, Cambridge, UK
Trevor Mudge  University of Michigan, Ann Arbor, MI
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1412587.1412589

ABSTRACT

This article extends our prior work to show that a straightforward use of 3D stacking technology enables the design of compact, energy-efficient servers. Our proposed architecture, called PicoServer, employs 3D technology to bond one die containing several simple, slow processing cores to multiple DRAM memory dies that together are sufficient for a primary memory. This use of 3D stacks readily facilitates wide, low-latency buses between processors and memory. These buses remove the need for an L2 cache, allowing its area to be reallocated to additional simple cores. The additional cores allow the clock frequency to be lowered without impairing throughput. A lower clock frequency means that thermal constraints, a concern with 3D stacking, are easily satisfied. We extend our original analysis of PicoServer to include: (1) a wider set of server workloads, (2) the impact of multithreading, and (3) the on-chip DRAM architecture and system memory usage. PicoServer is intentionally simple, requiring only the simplest form of 3D technology, in which dies are stacked on top of one another. Our intent is to minimize the risk of introducing a new technology (3D) to implement a class of low-cost, low-power, compact server architectures.
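
The trade-off described in the abstract (more simple cores at a lower clock, with no loss of throughput) can be illustrated with a rough first-order model. The sketch below is not taken from the article: it assumes a perfectly parallel server workload, so throughput scales as cores × frequency, and the textbook dynamic-power relation P ≈ C·V²·f per core. The function names, core counts, frequencies, voltages, and capacitance are all hypothetical values chosen only for illustration.

```python
# First-order illustration only; every number below is a hypothetical
# placeholder, not a figure from the PicoServer article.

def throughput(n_cores, freq_ghz, ipc=1.0):
    """Aggregate throughput (billions of instructions per second),
    assuming the workload parallelizes perfectly across cores."""
    return n_cores * freq_ghz * ipc


def dynamic_power(n_cores, freq_ghz, volts, cap_nf=1.0):
    """Total dynamic power in watts using P = C * V^2 * f per core
    (nF * V^2 * GHz works out to watts)."""
    return n_cores * cap_nf * volts ** 2 * freq_ghz


# Hypothetical design points: a few fast cores vs. many slow cores.
# The lower supply voltage at the lower clock is an assumption.
few_fast = dict(n_cores=4, freq_ghz=2.0, volts=1.2)
many_slow = dict(n_cores=8, freq_ghz=1.0, volts=0.9)

for name, cfg in (("few_fast", few_fast), ("many_slow", many_slow)):
    t = throughput(cfg["n_cores"], cfg["freq_ghz"])
    p = dynamic_power(**cfg)
    print(f"{name}: throughput={t:.1f} GIPS, dynamic power={p:.2f} W")
```

Under these assumptions both design points deliver the same aggregate throughput, while the many-slow-cores point draws less dynamic power because the lower clock permits a lower supply voltage. Real server workloads do not scale perfectly, and the model ignores leakage and memory power, so it should be read only as intuition for the argument.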


