|
ABSTRACT
The problem of information disclosure has attracted much interest from the research community in recent years. When disclosing information, the challenge is to provide as much information as possible (optimality) while guaranteeing a desired safety property for privacy (such as l-diversity). A typical disclosure algorithm uses a sequence of disclosure schemas to output generalizations in the nonincreasing order of data utility; the algorithm releases the first generalization that satisfies the safety property. In this paper, we assert that the desired safety property cannot always be guaranteed if an adversary has the knowledge of the underlying disclosure algorithm. We propose a model for the additional information disclosed by an algorithm based on the definition of deterministic disclosure function (DDF), and provide definitions of p-safe and p-optimal DDFs. We give an analysis for the complexity to compute a p-optimal DDF. We show that deciding whether a DDF is p-optimal is an NP-hard problem, and only under specific conditions, we can solve the problem in polynomial time with respect to the size of the set of all possible database instances and the length of the disclosure generalization sequence. We then consider the problem of microdata disclosure and the safety condition of l-diversity. We relax the notion of p-optimality to weak p-optimality, and develop a weak p-optimal algorithm which is polynomial in the size of the original table and the length of the generalization sequence.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
A. Dobra and S. E. Feinberg. Bounding entries in multi-way contingency tables given a set of marginal totals. In Foundations of Statistical Inference: Proceedings of the Shoresh Conference 2000. Springer Verlag, 2003.
|
| |
2
|
|
| |
3
|
A. Slavkovic and S. E. Feinberg. Bounds for cell entries in two-way tables given conditional relative frequencies. Privacy in Statistical Databases, 2004.
|
| |
4
|
V. Ciriani, S. D. C. di Vimercati, S. Foresti, and P. Samarati. k-anonymity. In Secure Data Management in Decentralized Systems (edited by T. Yuand S. Jajodia). Springer-Verlag, 2007.
|
 |
5
|
|
 |
6
|
|
 |
7
|
|
| |
8
|
G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigrahy, D. Thomas, and A. Zhu. k-anonymity: Algorithms and hardness. Technical report, Stanford University, 2004.
|
 |
9
|
|
| |
10
|
G. T. Duncan and S. E. Feinberg. Obtaining information while preserving privacy: A markov perturbation method for tabular data. In Joint Statistical Meetings. Anaheim,CA, 1997.
|
| |
11
|
I. P. Fellegi. On the question of statistical confidentiality. Journal of the American Statistical Association, 67(337):7--18, 1993.
|
 |
12
|
Jon Kleinberg , Christos Papadimitriou , Prabhakar Raghavan, Auditing Boolean attributes, Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.86-91, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335210]
|
| |
13
|
J. Schlorer. Identification and retrieval of personal records from a statistical bank. In Methods Info. Med., 1975.
|
 |
14
|
|
 |
15
|
|
| |
16
|
L. H. Cox. Solving confidentiality protection problems in tabulations using network optimization: A network model for cell suppression in the u.s. economic censuses. In Proceedings of the Internatinal Seminar on Statistical Confidentiality, pages 229--245. International Statistical Institute, Dublin, 1982.
|
| |
17
|
L. H. Cox. New results in disclosure avoidance for tabulations. In International Statistical Institute Proceedings of the 46th Session, pages 83--84. Tokyo, 1987.
|
| |
18
|
L. H. Cox. Suppression, methodology and statistical disclosure control. Journal of the American Statistical Association, 90:1453--1462, 1995.
|
| |
19
|
|
 |
20
|
|
 |
21
|
|
| |
22
|
P. Diaconis and B. Sturmfels. Algebraic algorithms for sampling from conditional distributions. Annals of Statistics, 1:363--397, 1998.
|
| |
23
|
|
| |
24
|
P. Samarati and L. Sweeney. Protecting privacy when disclosing information: k-anonymity and it senforcement through generalization and suppression. Technical report, CMU, SRI, 1998.
|
| |
25
|
|
| |
26
|
S. Chawla, C. Dwork, F. McSherry, A. Smith, and H. Wee. Toward privacy in public databases. In Theory of Cryptography Conference, 2005.
|
| |
27
|
T. Dalenius and S. Reiss. Data swapping: A technique for disclosure control. Journal of Statistical Planning and Inference, 6:73--85, 1982.
|
 |
28
|
|
|