QAGen: generating query-aware test databases
Source
International Conference on Management of Data
Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data
Beijing, China
SESSION: Benchmarking and performance evaluation
Pages: 341-352
Year of Publication: 2007
ISBN: 978-1-59593-686-8
Authors
Carsten Binnig  ETH Zurich, Zurich, Switzerland
Donald Kossmann  ETH Zurich, Zurich, Switzerland
Eric Lo  ETH Zurich, Zurich, Switzerland
M. Tamer Özsu  University of Waterloo, Waterloo, Canada
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1247480.1247520

ABSTRACT

Today, a common methodology for testing a database management system (DBMS) is to generate a set of test databases and then execute queries on top of them. However, for DBMS testing, it would be a big advantage if we could control the input and/or the output (e.g., the cardinality) of each individual operator of a test query for a particular test case. Unfortunately, current database generators generate databases independently of the queries. As a result, it is hard to guarantee that executing the test query on the generated test databases yields the desired (intermediate) query results that match the test case. In this paper, we propose a novel approach to DBMS testing. Instead of first generating a test database and then seeing how well it matches a particular test case (or otherwise resorting to trial and error to generate another test database), we propose to generate a query-aware database for each test case. To that end, we designed a query-aware test database generator called QAGen. In addition to the database schema and the set of basic constraints defined on the base tables, QAGen takes the query and the set of constraints defined on the query as input, and generates a query-aware test database as output. The generated database guarantees that the test query produces the desired (intermediate) query results as defined in the test case. This testing approach facilitates a wide range of DBMS testing tasks, such as testing memory managers and testing the cardinality estimation components of query optimizers.
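
To make the idea of a query-aware database concrete, the following is a minimal sketch for the simplest possible case: a single selection operator with a required output cardinality. It is not QAGen's algorithm, which the abstract does not describe; the table `person`, the predicate `age > 30`, and the helper names `generate_query_aware_table` and `run_test_query` are all hypothetical and chosen only for illustration. The sketch builds the data from the constraint (generate exactly as many matching rows as the test case demands) and then checks the result with SQLite.

```python
import random
import sqlite3


def generate_query_aware_table(table_size, output_cardinality, threshold=30, seed=0):
    """Build (id, age) rows so that exactly `output_cardinality` of the
    `table_size` rows satisfy the hypothetical predicate `age > threshold`.
    Assumes `threshold` is a non-negative integer."""
    if not 0 <= output_cardinality <= table_size:
        raise ValueError("output cardinality must be between 0 and table size")
    rng = random.Random(seed)
    # Rows that must pass the selection.
    matching = [rng.randint(threshold + 1, threshold + 50)
                for _ in range(output_cardinality)]
    # Rows that must be filtered out by the selection.
    non_matching = [rng.randint(0, threshold)
                    for _ in range(table_size - output_cardinality)]
    ages = matching + non_matching
    rng.shuffle(ages)  # avoid clustering all matching rows at the front
    return list(enumerate(ages))


def run_test_query(rows, threshold=30):
    """Load the rows into an in-memory SQLite database and return the
    cardinality of the test query's selection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, age INTEGER NOT NULL)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM person WHERE age > ?", (threshold,)
    ).fetchone()
    conn.close()
    return count


# Test case: a 1000-row base table whose selection must output 37 rows.
rows = generate_query_aware_table(table_size=1000, output_cardinality=37)
selected = run_test_query(rows)
print(len(rows), selected)
```

A query-independent generator would instead draw ages from some fixed distribution and hope that the selection happens to return the required number of rows, which is the trial-and-error approach the abstract argues against. Queries with several operators make the constraints interact, and this sketch does not attempt that case.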


