|
ABSTRACT
The output of a data mining algorithm is only as good as its inputs, and individuals are often unwilling to provide accurate data about sensitive topics such as medical history and personal finance. Individuals maybe willing to share their data, but only if they are assured that it will be used in an aggregate study and that it cannot be linked back to them. Protocols for anonymity-preserving data collection provide this assurance, in the absence of trusted parties, by allowing a set of mutually distrustful respondents to anonymously contribute data to an untrusted data miner.To effectively provide anonymity, a data collection protocol must be collusion resistant, which means that even if all dishonest respondents collude with a dishonest data miner in an attempt to learn the associations between honest respondents and their responses, they will be unable to do so. To achieve collusion resistance, previously proposed protocols for anonymity-preserving data collection have quadratically many communication rounds in the number of respondents, and employ (sometimes incorrectly) complicated cryptographic techniques such as zero-knowledge proofs.We describe a new protocol for anonymity-preserving, collusion resistant data collection. Our protocol has linearly many communication rounds, and achieves collusion resistance without relying on zero-knowledge proofs. This makes it especially suitable for data mining scenarios with a large number of respondents.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
|
| |
3
|
M. Bellare and P. Rogaway. Introduction to modern cryptography, 2005. http://www-cse.ucsd.edu/users/mihir/cse207/classnotes.html.
|
 |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
R. Dingledine, N. Mathewson, and P. Syverson. Tor: the second-generation onion router. In Proc. 13th USENIX Security Symposium, pages 303--320, 2004.
|
 |
8
|
|
| |
9
|
|
| |
10
|
|
 |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
Y. Lindell and B. Pinkas. Privacy preserving data mining. J. Cryptology, 15(3):177--206, 2002.
|
| |
15
|
A. Neff. Verifiable mixing (shuffling) of ElGamal pairs. url http://www.votehere.net/vhti/documentation/egshuf.pdf.
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
| |
19
|
D. Wikström. Four practical attacks for "optimistic mixing for exit-polls". Technical Report T2003-04, Swedish Institute of Computer Sciences (SICS), 2003.
|
 |
20
|
|
 |
21
|
|
| |
22
|
A. Yao. How to generate and exchange secrets. In FOCS '86: Proc. of the 27th IEEE Annual Symposium on Foundations of Computer Science, pages 162--167, 1986.
|
CITED BY 4
|
|
|
|
|
|
|
|
|
|
|
Graham Cormode , Divesh Srivastava, Anonymized data: generation, models, usage, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|