|
ABSTRACT
We consider a multi-round auction setting motivated by pay-per-click auctions for Internet advertising. In each round the auctioneer selects an advertiser and shows her ad, which is then either clicked or not. An advertiser derives value from clicks; the value of a click is her private information. Initially, neither the auctioneer nor the advertisers have any information about the likelihood of clicks on the advertisements. The auctioneer's goal is to design a (dominant strategies) truthful mechanism that (approximately) maximizes the social welfare. If the advertisers bid their true private values, our problem is equivalent to the multi-armed bandit problem, and thus can be viewed as a strategic version of the latter. In particular, for both problems the quality of an algorithm can be characterized by regret, the difference in social welfare between the algorithm and the benchmark which always selects the same "best" advertisement. We investigate how the design of multi-armed bandit algorithms is affected by the restriction that the resulting mechanism must be truthful. We find that truthful mechanisms have certain strong structural properties -- essentially, they must separate exploration from exploitation -- and they incur much higher regret than the optimal multi-armed bandit algorithms. Moreover, we provide a truthful mechanism which (essentially) matches our lower bound on regret.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Jacob Abernethy, Elad Hazan, and Alexander Rakhlin. Competing in the dark: An efficient algorithm for bandit linear optimization. In Conf. on Learning Theory (COLT), pages 263--274, 2008.
|
 |
2
|
Gagan Aggarwal , Ashish Goel , Rajeev Motwani, Truthful auctions for pricing search keywords, Proceedings of the 7th ACM conference on Electronic commerce, p.1-7, June 11-15, 2006, Ann Arbor, Michigan, USA
[doi> 10.1145/1134707.1134708]
|
| |
3
|
|
| |
4
|
|
| |
5
|
Susan Athey and Ilya Segal. An efficient dynamic mechanism. Available from http://www.stanford.edu/~isegal/agv.pdf, March 2007.
|
| |
6
|
|
| |
7
|
|
| |
8
|
Moshe Babaioff, Yogeshwer Sharma, and Aleksandrs Slivkins. Characterizing Truthful Multi-Armed Bandit Mechanisms. Technical Report http://arxiv.org/abs/0812.2291, December 2008. Minor revision February 2009.
|
| |
9
|
Peter L. Bartlett, Varsha Dani, Thomas Hayes, Sham Kakade, Alexander Rakhlin, and Ambuj Tewari. High-probability regret bounds for bandit online linear optimization. In Conf. on Learning Theory (COLT), pages 335--342, 2008.
|
| |
10
|
|
| |
11
|
Dirk Bergemann and Juuso Välimäki. Efficient dynamic auctions. Available from cowles.econ.yale.edu/P/cd/d15b/d1584.pdf, October 2006.
|
| |
12
|
Donald Berry and Bert Fristedt. Bandit problems: sequential allocation of experiments. Chapman&Hall, 1985.
|
| |
13
|
|
 |
14
|
|
 |
15
|
|
 |
16
|
|
| |
17
|
Benjamin Edelman, Michael Ostrovsky, and Michael Schwarz. Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords. American Economic Review, 97(1):242--259, March 2007.
|
| |
18
|
J.C. Gittins. Multi-Armed Bandit Allocation Indices. John Wiley&Sons, 1989.
|
 |
19
|
|
| |
20
|
Nicole Immorlica, Kamal Jain, Mohammad Mahdian, and Kunal Talwar. Click fraud resistant methods for learning click-through rates. In Intl. Workshop On Internet And Network Economics (WINE), pages 34--45, 2005.
|
| |
21
|
|
| |
22
|
Robert Kleinberg. Lecture notes: CS683: Learning, Games, and Electronic Markets (week 8), Spring 2007. Available at http://www.cs.cornell.edu/courses/cs683/2007sp.
|
| |
23
|
Robert Kleinberg. Lecture notes: CS683: Learning, Games, and Electronic Markets (week 9), Spring 2007. Available at http://www.cs.cornell.edu/courses/cs683/2007sp.
|
 |
24
|
|
| |
25
|
Sebastien Lahaie, David M. Pennock, Amin Saberi, and Rakesh V. Vohra. In N. Nisan, T. Roughgarden, E. Tardos, and V. Vazirani (eds.) Chapter 28, Sponsored search auctions. Cambridge University Press., 2007.
|
| |
26
|
T.L. Lai and Herbert Robbins. Asymptotically efficient Adaptive Allocation Rules. Advances in Applied Mathematics, 6:4--22, 1985.
|
| |
27
|
|
| |
28
|
|
 |
29
|
|
| |
30
|
Roger B. Myerson. Optimal Auction Design. Mathematics of Operations Research, 6:58--73, 1981.
|
 |
31
|
|
| |
32
|
N. Nisan and A. Ronen. Algorithmic Mechanism Design. Games and Economic Behavior, 35(1-2):166--196, 2001.
|
| |
33
|
|
| |
34
|
|
| |
35
|
Tim Roughgarden. An algorithmic game theory primer. IFIP International Conference on Theoretical Computer Science (TCS). An invited survey., 2008.
|
| |
36
|
Hal R. Varian. Position auctions. International Journal of Industrial Organization, 25(6):1163--1178, December 2007.
|
|