I/O reference behavior of production database workloads and the TPC benchmarks—an analysis at the logical level
Source: ACM Transactions on Database Systems (TODS)
Volume 26, Issue 1 (March 2001)
Pages: 96-143
Year of Publication: 2001
ISSN: 0362-5915
Authors
Windsor W. Hsu  Univ. of California, Berkeley, IBM Almaden Research Center, San Jose, CA
Alan Jay Smith  Univ. of California, Berkeley, IBM Almaden Research Center, San Jose, CA
Honesty C. Young  IBM Almaden Research Center, San Jose, CA
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 146,   Citation Count: 10
DOI: http://doi.acm.org/10.1145/383734.383737

ABSTRACT

As improvements in processor performance continue to far outpace improvements in storage performance, I/O is increasingly the bottleneck in computer systems, especially in large database systems that manage huge amounts of data. The key to achieving good I/O performance is to thoroughly understand its characteristics. In this article we present a comprehensive analysis of the logical I/O reference behavior of the peak production database workloads from ten of the world's largest corporations. In particular, we focus on how these workloads respond to different techniques for caching, prefetching, and write buffering. Our findings include several broadly applicable rules of thumb that describe how effective the various I/O optimization techniques are for the production workloads. For instance, our results indicate that the buffer pool miss ratio tends to be related to the ratio of buffer pool size to data size by an inverse square root rule. A similar fourth root rule relates the write miss ratio and the ratio of buffer pool size to data size. In addition, we characterize the reference characteristics of workloads similar to the Transaction Processing Performance Council (TPC) benchmarks C (TPC-C) and D (TPC-D), which are de facto standard performance measures for online transaction processing (OLTP) systems and decision support systems (DSS), respectively. Since benchmarks such as TPC-C and TPC-D can only be used effectively if their strengths and limitations are understood, a major focus of our analysis is to identify aspects of the benchmarks that stress the system differently than the production workloads. We discover that for the most part, the reference behavior of TPC-C and TPC-D falls within the range of behavior exhibited by the production workloads. However, there are some noteworthy exceptions that affect well-known I/O optimization techniques such as caching (LRU is further from the optimal for TPC-C, while there is little sharing of pages between transactions for TPC-D), prefetching (TPC-C exhibits no significant sequentiality), and write buffering (write buffering is less effective for the TPC benchmarks). While the two TPC benchmarks generally complement one another in reflecting the characteristics of the production workloads, there remain aspects of the real workloads that are not represented by either of the benchmarks.
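
The two rules of thumb in the abstract can be illustrated with a small extrapolation sketch. It assumes the rules take the power-law form miss ratio = k * (buffer pool size / data size)^(-e), with e = 1/2 for the inverse square root rule and e = 1/4 for the fourth root rule (read here as an inverse fourth root, which is an interpretation and not stated in the abstract). The constant k is not given in the abstract, so the sketch eliminates it by scaling from one measured point. The function name and the sample numbers are hypothetical.

```python
def scaled_miss_ratio(base_miss_ratio, base_size_ratio, new_size_ratio, exponent=0.5):
    """Extrapolate a miss ratio under an assumed power-law rule of thumb.

    Assumes miss_ratio = k * (buffer_size / data_size) ** -exponent.
    The unknown constant k cancels when scaling from one measured point.
    """
    if base_size_ratio <= 0 or new_size_ratio <= 0:
        raise ValueError("size ratios must be positive")
    return base_miss_ratio * (new_size_ratio / base_size_ratio) ** -exponent


# Hypothetical measured point: 8% miss ratio when the buffer pool is 1% of the data.
# Inverse square root rule: quadrupling the buffer pool halves the miss ratio.
read_mr = scaled_miss_ratio(0.08, 0.01, 0.04)

# Assumed inverse fourth root rule for writes: a 16x larger buffer pool is
# needed to halve the write miss ratio.
write_mr = scaled_miss_ratio(0.08, 0.01, 0.16, exponent=0.25)

print(f"read miss ratio:  {read_mr:.4f}")
print(f"write miss ratio: {write_mr:.4f}")
```

Under these assumptions, the smaller exponent for writes means that growing the buffer pool pays off more slowly for the write miss ratio than for the overall miss ratio.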



ADDITIONAL RESOURCES

Technical reports, additional information and data related to the paper are available at http://www.cs.berkeley.edu/~windsorh.

