ABSTRACT
Traditionally, application software developers test against their own local development databases. However, such local databases usually contain only a small amount of sample data and hence cannot satisfactorily simulate a live environment, especially for performance and scalability testing. On the other hand, testing applications over live production databases is increasingly problematic, primarily because it can expose sensitive data to an unauthorized tester and can incorrectly update information in the underlying database. In this paper, we investigate techniques to generate mock databases for application software testing without revealing any confidential information from the live production databases. Specifically, we will design mechanisms to create the deterministic rule set R, the non-deterministic rule set NR, and the statistical data set S for a live production database. We will then build a security Analyzer that processes the triplet <R, NR, S> together with the security requirements (security policy) and outputs a new triplet <R', NR', S'>. The security Analyzer will guarantee that no confidential information can be inferred from the new triplet <R', NR', S'>. The mock database generated from this new triplet can simulate the live environment for testing purposes while maintaining the privacy of the data in the original database.
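The pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: rules are represented as dictionaries with a `columns` list and a `check` predicate, S is a per-column value-frequency table, the security policy is a set of confidential column names, and the toy `analyze` function simply drops anything that mentions a confidential column. It offers no inference guarantee of the kind the paper's Analyzer is meant to provide. The generator uses the deterministic rules and the statistics only; the non-deterministic rules are filtered but not enforced.

```python
import random

def analyze(R, NR, S, confidential):
    """Toy stand-in for the security Analyzer (hypothetical, not the paper's method):
    drop every rule and statistic that mentions a confidential column."""
    def is_safe(rule):
        return not (set(rule["columns"]) & confidential)
    R2 = [r for r in R if is_safe(r)]
    NR2 = [r for r in NR if is_safe(r)]
    S2 = {col: dist for col, dist in S.items() if col not in confidential}
    return R2, NR2, S2

def generate(R, S, n, seed=0, max_tries=10000):
    """Rejection sampling: draw each column from its frequency table in S and
    keep only rows that satisfy every deterministic rule in R."""
    rng = random.Random(seed)
    rows, tries = [], 0
    while len(rows) < n and tries < max_tries:
        tries += 1
        row = {col: rng.choices(list(dist), weights=list(dist.values()))[0]
               for col, dist in S.items()}
        if all(rule["check"](row) for rule in R):
            rows.append(row)
    return rows

# Hypothetical triplet extracted from a production table (age, dept, salary).
R = [
    {"columns": ["age"], "check": lambda row: row["age"] >= 18},
    {"columns": ["salary"], "check": lambda row: row["salary"] > 0},
]
NR = [
    # "Engineers usually earn 90000", holding with an assumed probability.
    {"columns": ["dept", "salary"], "prob": 0.8,
     "check": lambda row: row["dept"] != "eng" or row["salary"] == 90000},
]
S = {
    "age": {17: 1, 25: 3, 40: 2},
    "dept": {"sales": 1, "eng": 1},
    "salary": {50000: 1, 90000: 1},
}

# Assumed security policy: the salary column is confidential.
R2, NR2, S2 = analyze(R, NR, S, confidential={"salary"})
mock_rows = generate(R2, S2, n=5)
```

In this sketch the mock rows contain only the `age` and `dept` columns, and every row satisfies the surviving rule on `age`. A realistic analyzer would also have to account for indirect inference, for example a non-confidential column that is strongly correlated with a confidential one.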