Piranha: a scalable architecture based on single-chip multiprocessing
Source
International Symposium on Computer Architecture
Proceedings of the 27th annual international symposium on Computer architecture
Vancouver, British Columbia, Canada
Pages: 282 - 293  
Year of Publication: 2000
ISBN:1-58113-232-8
Authors
Luiz André Barroso  Western Research Laboratory, Compaq Computer Corporation, Palo Alto, CA
Kourosh Gharachorloo  Western Research Laboratory, Compaq Computer Corporation, Palo Alto, CA
Robert McNamara  Systems Research Center, Compaq Computer Corporation, Palo Alto, CA
Andreas Nowatzyk  Western Research Laboratory, Compaq Computer Corporation, Palo Alto, CA
Shaz Qadeer  Systems Research Center, Compaq Computer Corporation, Palo Alto, CA
Barton Sano  Western Research Laboratory, Compaq Computer Corporation, Palo Alto, CA
Scott Smith  NonStop Hardware Development, Compaq Computer Corporation, Austin, TX
Robert Stets  Western Research Laboratory, Compaq Computer Corporation, Palo Alto, CA
Ben Verghese  Western Research Laboratory, Compaq Computer Corporation, Palo Alto, CA
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/339647.339696

ABSTRACT

The microprocessor industry is currently struggling with higher development costs and longer design times that arise from exceedingly complex processors that are pushing the limits of instruction-level parallelism. Meanwhile, such designs are especially ill-suited for important commercial applications, such as on-line transaction processing (OLTP), which suffer from large memory stall times and exhibit little instruction-level parallelism. Given that commercial applications constitute by far the most important market for high-performance servers, the above trends emphasize the need to consider alternative processor designs that specifically target such workloads. The abundance of explicit thread-level parallelism in commercial workloads, along with advances in semiconductor integration density, identifies chip multiprocessing (CMP) as potentially the most promising approach for designing processors targeted at commercial servers. This paper describes the Piranha system, a research prototype being developed at Compaq that aggressively exploits chip multiprocessing by integrating eight simple Alpha processor cores along with a two-level cache hierarchy onto a single chip. Piranha also integrates further on-chip functionality to allow for scalable multiprocessor configurations to be built in a glueless and modular fashion. The use of simple processor cores combined with an industry-standard ASIC design methodology allows us to complete our prototype within a short time-frame, with a team size and investment that are an order of magnitude smaller than those of a commercial microprocessor. Our detailed simulation results show that while each Piranha processor core is substantially slower than an aggressive next-generation processor, the integration of eight cores onto a single chip allows Piranha to outperform next-generation processors by up to 2.9 times (on a per chip basis) on important workloads such as OLTP. This performance advantage can approach a factor of five by using full-custom instead of ASIC logic. In addition to exploiting chip multiprocessing, the Piranha prototype incorporates several other unique design choices including a shared second-level cache with no inclusion, a highly optimized cache coherence protocol, and a novel I/O architecture.
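
The per-chip figures in the abstract can be turned into a rough per-core comparison. The sketch below is only back-of-the-envelope arithmetic, not a result from the paper: it assumes chip throughput scales linearly with the number of cores (an idealization that ignores cache and memory contention), and the function name `implied_per_core_ratio` is made up for illustration. The inputs 2.9 and 5.0 are the abstract's "up to" and "can approach" bounds, so the outputs are upper estimates.

```python
def implied_per_core_ratio(chip_speedup: float, cores: int) -> float:
    """Throughput of one simple core relative to one baseline chip.

    Assumes (hypothetically) that chip throughput is the sum of equal
    per-core contributions, i.e. perfect linear scaling across cores.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return chip_speedup / cores


CORES = 8  # eight Alpha cores per Piranha chip, as stated in the abstract

# "up to 2.9 times" per chip with ASIC logic
asic_ratio = implied_per_core_ratio(2.9, CORES)

# "can approach a factor of five" with full-custom logic
custom_ratio = implied_per_core_ratio(5.0, CORES)

print(f"ASIC per-core ratio (upper estimate):        {asic_ratio:.4f}")
print(f"full-custom per-core ratio (upper estimate): {custom_ratio:.4f}")
```

Under that linear-scaling assumption, each core would deliver well under the OLTP throughput of one next-generation processor, which is consistent with the abstract's statement that each core is substantially slower while the chip as a whole is faster.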

