ABSTRACT
This paper evaluates performance characteristics of the Compaq ES40 shared-memory multiprocessor. The ES40 system contains up to four Alpha 21264 CPUs together with a high-performance memory system. We qualitatively describe architectural features included in the 21264 microprocessor and the surrounding system chipset. We further quantitatively show the performance effects of these features using benchmark results and profiling data collected from industry-standard commercial and technical workloads. The profile data includes basic performance information - such as instructions per cycle, branch mispredicts, and cache misses - as well as other data that specifically characterizes the 21264. Wherever possible, we compare and contrast the ES40 to the AlphaServer 4100 - a previous-generation Alpha system containing four Alpha 21164 microprocessors - to highlight the architectural advances in the ES40. We find that the Compaq ES40 often provides 2 to 3 times the performance of the AlphaServer 4100 at similar clock frequencies. We also find that the ES40 memory system has about five times the memory bandwidth of the 4100. These performance improvements come from numerous microprocessor and platform enhancements, including out-of-order execution, branch prediction, functional units, and the memory system.