Privacy-preserving data mashup
Source: Extending Database Technology; Vol. 360
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Saint Petersburg, Russia
SESSION: Research sessions: Privacy & security
Pages 228-239
Year of Publication: 2009
ISBN: 978-1-60558-422-5
Authors
Noman Mohammed  Concordia University, Montreal, QC, Canada
Benjamin C. M. Fung  Concordia University, Montreal, QC, Canada
Ke Wang  Simon Fraser University, Burnaby, BC, Canada
Patrick C. K. Hung  University of Ontario Institute of Technology, Oshawa, ON, Canada
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1516360.1516388

ABSTRACT

Mashup is a web technology that combines information from more than one source into a single web application. This technique provides a new platform for different data providers to flexibly integrate their expertise and deliver highly customizable services to their customers. Nonetheless, combining data from different sources could potentially reveal person-specific sensitive information. In this paper, we study and resolve a real-life privacy problem in a data mashup application for the financial industry in Sweden. We propose a privacy-preserving data mashup (PPMashup) algorithm that securely integrates private data from different data providers while the integrated data retains the essential information for supporting general data exploration or a specific data mining task, such as classification analysis. Experiments on real-life data suggest that the proposed method is effective at simultaneously preserving both privacy and information usefulness, and that it scales to large volumes of data.
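The privacy risk named in the abstract can be illustrated with a minimal sketch. It is not the PPMashup algorithm, whose details are not given on this page. The two provider tables, the attribute names (`job`, `age_band`), and the helper functions are all hypothetical. The sketch only shows that records which are indistinguishable within each provider's own table can become unique once the tables are joined.

```python
from collections import Counter

# Hypothetical records held by two providers, keyed by a shared customer ID.
provider_a = {
    1: {"job": "Lawyer"},
    2: {"job": "Lawyer"},
    3: {"job": "Clerk"},
    4: {"job": "Clerk"},
}
provider_b = {
    1: {"age_band": "30-39"},
    2: {"age_band": "40-49"},
    3: {"age_band": "30-39"},
    4: {"age_band": "40-49"},
}


def smallest_group(rows, attrs):
    """Return the size of the smallest group of rows that share the same
    values on the given attributes."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    return min(counts.values())


def mashup(a, b):
    """Join the two tables on the shared ID (a plain, unprotected join)."""
    return [{**a[i], **b[i]} for i in sorted(a.keys() & b.keys())]


joined = mashup(provider_a, provider_b)

# Within each provider's table, every record shares its values with another.
print(smallest_group(provider_a.values(), ["job"]))
print(smallest_group(provider_b.values(), ["age_band"]))
# After the join, the combination of attributes can single out one person.
print(smallest_group(joined, ["job", "age_band"]))
```

In this toy data the smallest group has two records in each source table but only one record after the join, which is the kind of exposure a privacy-preserving integration would need to prevent, for example by generalizing attribute values before release.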


