ACM Home Page
Please provide us with feedback. Feedback
Characterizing truthful multi-armed bandit mechanisms: extended abstract
Full text PdfPdf (493 KB)
Source
Electronic Commerce archive
Proceedings of the tenth ACM conference on Electronic commerce table of contents
Stanford, California, USA
SESSION: Session 3 table of contents
Pages 79-88  
Year of Publication: 2009
ISBN:978-1-60558-458-4
Authors
Moshe Babaioff  Microsoft Research, Mountain View, CA, USA
Yogeshwer Sharma  Cornell University, Ithaca, NY, USA
Aleksandrs Slivkins  Microsoft Research, Mountain View, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGEcom: ACM Special Interest Group on Electronic Commerce
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 13,   Downloads (12 Months): 33,   Citation Count: 1
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1566374.1566386
What is a DOI?

ABSTRACT

We consider a multi-round auction setting motivated by pay-per-click auctions for Internet advertising. In each round the auctioneer selects an advertiser and shows her ad, which is then either clicked or not. An advertiser derives value from clicks; the value of a click is her private information. Initially, neither the auctioneer nor the advertisers have any information about the likelihood of clicks on the advertisements. The auctioneer's goal is to design a (dominant strategies) truthful mechanism that (approximately) maximizes the social welfare.

If the advertisers bid their true private values, our problem is equivalent to the multi-armed bandit problem, and thus can be viewed as a strategic version of the latter. In particular, for both problems the quality of an algorithm can be characterized by regret, the difference in social welfare between the algorithm and the benchmark which always selects the same "best" advertisement. We investigate how the design of multi-armed bandit algorithms is affected by the restriction that the resulting mechanism must be truthful. We find that truthful mechanisms have certain strong structural properties -- essentially, they must separate exploration from exploitation -- and they incur much higher regret than the optimal multi-armed bandit algorithms. Moreover, we provide a truthful mechanism which (essentially) matches our lower bound on regret.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Jacob Abernethy, Elad Hazan, and Alexander Rakhlin. Competing in the dark: An efficient algorithm for bandit linear optimization. In Conf. on Learning Theory (COLT), pages 263--274, 2008.
2
 
3
 
4
 
5
Susan Athey and Ilya Segal. An efficient dynamic mechanism. Available from http://www.stanford.edu/~isegal/agv.pdf, March 2007.
 
6
 
7
 
8
Moshe Babaioff, Yogeshwer Sharma, and Aleksandrs Slivkins. Characterizing Truthful Multi-Armed Bandit Mechanisms. Technical Report http://arxiv.org/abs/0812.2291, December 2008. Minor revision February 2009.
 
9
Peter L. Bartlett, Varsha Dani, Thomas Hayes, Sham Kakade, Alexander Rakhlin, and Ambuj Tewari. High-probability regret bounds for bandit online linear optimization. In Conf. on Learning Theory (COLT), pages 335--342, 2008.
 
10
 
11
Dirk Bergemann and Juuso Välimäki. Efficient dynamic auctions. Available from cowles.econ.yale.edu/P/cd/d15b/d1584.pdf, October 2006.
 
12
Donald Berry and Bert Fristedt. Bandit problems: sequential allocation of experiments. Chapman&Hall, 1985.
 
13
14
15
16
 
17
Benjamin Edelman, Michael Ostrovsky, and Michael Schwarz. Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords. American Economic Review, 97(1):242--259, March 2007.
 
18
J.C. Gittins. Multi-Armed Bandit Allocation Indices. John Wiley&Sons, 1989.
19
 
20
Nicole Immorlica, Kamal Jain, Mohammad Mahdian, and Kunal Talwar. Click fraud resistant methods for learning click-through rates. In Intl. Workshop On Internet And Network Economics (WINE), pages 34--45, 2005.
 
21
 
22
Robert Kleinberg. Lecture notes: CS683: Learning, Games, and Electronic Markets (week 8), Spring 2007. Available at http://www.cs.cornell.edu/courses/cs683/2007sp.
 
23
Robert Kleinberg. Lecture notes: CS683: Learning, Games, and Electronic Markets (week 9), Spring 2007. Available at http://www.cs.cornell.edu/courses/cs683/2007sp.
24
 
25
Sebastien Lahaie, David M. Pennock, Amin Saberi, and Rakesh V. Vohra. In N. Nisan, T. Roughgarden, E. Tardos, and V. Vazirani (eds.) Chapter 28, Sponsored search auctions. Cambridge University Press., 2007.
 
26
T.L. Lai and Herbert Robbins. Asymptotically efficient Adaptive Allocation Rules. Advances in Applied Mathematics, 6:4--22, 1985.
 
27
 
28
29
 
30
Roger B. Myerson. Optimal Auction Design. Mathematics of Operations Research, 6:58--73, 1981.
31
 
32
N. Nisan and A. Ronen. Algorithmic Mechanism Design. Games and Economic Behavior, 35(1-2):166--196, 2001.
 
33
 
34
 
35
Tim Roughgarden. An algorithmic game theory primer. IFIP International Conference on Theoretical Computer Science (TCS). An invited survey., 2008.
 
36
Hal R. Varian. Position auctions. International Journal of Industrial Organization, 25(6):1163--1178, December 2007.


Collaborative Colleagues:
Moshe Babaioff: colleagues
Yogeshwer Sharma: colleagues
Aleksandrs Slivkins: colleagues