| Efficient learning algorithms for changing environments |
| Full text |
Pdf
(663 KB)
|
| Source
|
ACM International Conference Proceeding Series; Vol. 382
archive
Proceedings of the 26th Annual International Conference on Machine Learning
table of contents
Montreal, Quebec, Canada
Pages 393-400
Year of Publication: 2009
ISBN:978-1-60558-516-1
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 9, Downloads (12 Months): 51, Citation Count: 0
|
|
|
ABSTRACT
We study online learning in an oblivious changing environment. The standard measure of regret bounds the difference between the cost of the online learner and the best decision in hindsight. Hence, regret minimizing algorithms tend to converge to the static best optimum, clearly a suboptimal behavior in changing environments. On the other hand, various metrics proposed to strengthen regret and allow for more dynamic algorithms produce inefficient algorithms. We propose a different performance metric which strengthens the standard metric of regret and measures performance with respect to a changing comparator. We then describe a series of data-streaming-based reductions which transform algorithms for minimizing (standard) regret into adaptive algorithms albeit incurring only poly-logarithmic computational overhead. Using this reduction, we obtain efficient low adaptive-regret algorithms for the problem of online convex optimization. This can be applied to various learning scenarios, i.e. online portfolio selection, for which we describe experimental results showing the advantage of adaptivity.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Amit Agarwal , Elad Hazan , Satyen Kale , Robert E. Schapire, Algorithms for portfolio management based on the Newton method, Proceedings of the 23rd international conference on Machine learning, p.9-16, June 25-29, 2006, Pittsburgh, Pennsylvania
[doi> 10.1145/1143844.1143846]
|
| |
2
|
|
| |
3
|
|
| |
4
|
|
| |
5
|
Cover, T. (1991). Universal portfolios. Math. Finance, 1, 1--19.
|
 |
6
|
Yoav Freund , Robert E. Schapire , Yoram Singer , Manfred K. Warmuth, Using and combining predictors that specialize, Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, p.334-343, May 04-06, 1997, El Paso, Texas, United States
[doi> 10.1145/258533.258616]
|
| |
7
|
Parikshit Gopalan , T. S. Jayram , Robert Krauthgamer , Ravi Kumar, Estimating the sortedness of a data stream, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, p.318-327, January 07-09, 2007, New Orleans, Louisiana
|
| |
8
|
Hazan, E., Kalai, A., Kale, S., & Agarwal, A. (2006). Logarithmic regret algorithms for online convex optimization. Proceedings of 19th Annual Conference on Learning Theory, (COLT) (pp. 499--513).
|
| |
9
|
Hazan, E., & Seshadhri, C. (2007). Adaptive algorithms for online decision problems. Electronic Colloquium on Computational Complexity (ECCC), 14.
|
| |
10
|
|
| |
11
|
Kozat, S., & Singer, A. (2007). Universal constant rebalanced portfolios with switching. IEEE International Conference on Acoustics, Speech and Signal Processing, (ICASSP) (pp. 1129--1132).
|
| |
12
|
Lehrer, E. (2003). A wide range no-regret theorem. Games and Economic Behavior, 42, 101--115.
|
| |
13
|
Woodruff, D. (2007). personal communications.
|
| |
14
|
Zinkevich, M. (2003). Online convex programming and generalized infinitesimal gradient ascent. Proceedings of the Twentieth International Conference (ICML) (pp. 928--936).
|
|