Privacy-preserving reinforcement learning
Source: ICML; Vol. 307
Proceedings of the 25th International Conference on Machine Learning
Helsinki, Finland
Pages 864-871  
Year of Publication: 2008
ISBN:978-1-60558-205-4
Authors
Jun Sakuma  Tokyo Institute of Technology, Yokohama, Japan
Shigenobu Kobayashi  Tokyo Institute of Technology, Yokohama, Japan
Rebecca N. Wright  Rutgers University, Piscataway, NJ
Sponsors
Yahoo!
Xerox
IBM
NSF
Microsoft Research
Machine Learning Journal/Springer
Pascal
University of Helsinki
Federation of Finnish Learned Societies
Intel Corporation
Google
Helsinki Institute for Information Technology
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390156.1390265

ABSTRACT

We consider the problem of distributed reinforcement learning (DRL) from private perceptions. In our setting, agents' perceptions, such as states, rewards, and actions, are not only distributed but must also be kept private. Conventional DRL algorithms can handle multiple agents, but they do not necessarily guarantee privacy preservation and may not guarantee optimality. In this work, we design cryptographic solutions that achieve optimal policies without requiring the agents to share their private information.

