ABSTRACT
On-Line Transaction Processing (OLTP) workloads are crucial benchmarks for the design and analysis of server processors. Typical cached configurations used by researchers to simulate OLTP workloads are orders of magnitude smaller than the fully scaled configurations used by OEM vendors to achieve world-record transaction processing throughput. The objective of this study is to discover the underlying relationships that characterize OLTP performance over a wide range of configurations. To this end, we have derived the "iron law" of database performance. Using our iron law, we show that both the average instructions executed per transaction (IPX) and the average cycles per instruction (CPI) are critical to transaction-throughput performance. We use an extensive, empirical examination of an Oracle®-based commercial OLTP workload on an Intel® Xeon™ multiprocessor system to characterize the scaling behavior of both the IPX and the CPI. We demonstrate that across a wide range of configurations the IPX and CPI behavior follows predictable trends, which can be accurately characterized by simple linear or piece-wise linear approximations. Based on our data, we propose a method for selecting a minimal, representative workload configuration from which behaviors of much larger OLTP configurations can be accurately extrapolated.
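The abstract names the iron law but does not spell it out. By analogy with the classic iron law of processor performance, one plausible form is throughput = (processors × clock frequency) / (IPX × CPI). The Python sketch below assumes that form and is not taken from the paper. It also shows how a linear fit of CPI against configuration size could be used for extrapolation. The function names, the unit of "configuration size", and all numeric values are hypothetical and chosen only for illustration; they are not measurements.

```python
def transactions_per_second(num_processors, frequency_hz, ipx, cpi):
    """Throughput under the ASSUMED iron-law form:
    TPS = (processors * frequency) / (IPX * CPI)."""
    cycles_per_transaction = ipx * cpi
    return num_processors * frequency_hz / cycles_per_transaction


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate(slope, intercept, x):
    """Evaluate the fitted line at a configuration size x."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical small configurations and made-up CPI values.
    sizes = [100, 200, 300]
    cpis = [3.0, 3.5, 4.0]
    slope, intercept = fit_line(sizes, cpis)

    # Predict CPI at a much larger (hypothetical) configuration.
    big_cpi = extrapolate(slope, intercept, 800)

    # Hypothetical machine: 4 processors at 2 GHz, 1e6 instructions per transaction.
    tps = transactions_per_second(4, 2.0e9, 1.0e6, big_cpi)
    print(f"predicted CPI = {big_cpi:.2f}, predicted TPS = {tps:.1f}")
```

Under this assumed form, IPX and CPI enter the denominator symmetrically: doubling either one halves throughput, which is consistent with the abstract's claim that both are critical. A piece-wise linear model could be handled the same way by fitting separate lines on each side of a chosen breakpoint.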