
A continuous-time approach to online optimization

Abstract
We consider a family of mirror descent strategies for online optimization in continuous time and we show that they lead to no regret. From a more traditional, discrete-time viewpoint, this continuous-time approach allows us to derive the no-regret properties of a large class of discrete-time algorithms, including as special cases the exponential weights algorithm, online mirror descent, smooth fictitious play and vanishingly smooth fictitious play. In so doing, we obtain a unified view of many classical regret bounds and show that they can be decomposed into a term stemming from continuous-time considerations and a term which measures the disparity between discrete and continuous time. This generalizes the continuous-time analysis of the exponential weights algorithm of [29]. As a result, we obtain a general class of infinite-horizon learning strategies that guarantee an $\mathcal{O}(n^{-1/2})$ regret bound without having to resort to a doubling trick.

    Mathematics Subject Classification: Primary: 68Q32, 68T05, 91A26; Secondary: 90C25.

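The simplest instance of such an infinite-horizon strategy is the EW' variant summarized in Table 1 below: exponential weights run with the vanishing step size $\eta_n = \eta/\sqrt{n}$, which attains the $\mathcal{O}(n^{-1/2})$ regret rate mentioned above without a doubling trick. The following Python snippet is a minimal sketch of that variant; the function name, array-based interface and numerical stabilization are illustrative choices, not code from the paper.

```python
import numpy as np

def exponential_weights(payoffs, eta=1.0):
    """Anytime exponential weights (the EW' row of Table 1): play
    x_n proportional to exp(eta_n * U_{n-1}) with eta_n = eta / sqrt(n),
    where U_{n-1} is the cumulative payoff vector observed so far."""
    n, d = payoffs.shape                 # payoffs[k] is the payoff vector v_{k+1}
    score = np.zeros(d)                  # cumulative payoff vector, U_0 = 0
    plays = np.empty((n, d))
    for k in range(n):
        eta_k = eta / np.sqrt(k + 1)     # vanishing step size: no doubling trick
        logits = eta_k * score
        w = np.exp(logits - logits.max())  # shift for numerical stability
        plays[k] = w / w.sum()           # mixed strategy on the simplex
        score += payoffs[k]              # v_n is revealed after x_n is played
    return plays
```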
Figure 1.  Graphical illustration of the greedy (dashed) and lazy (solid) branches of the projected subgradient (PSG) method

Figure 2.  Greedy and lazy mirror descent with $\gamma_n = 1$
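Figures 1 and 2 contrast the greedy branch, which projects back to the feasible set after every gradient step, with the lazy branch, which aggregates the steps without projecting and only projects when a point has to be played. The sketch below illustrates the two update orders for the projected subgradient method, taking a Euclidean ball as an example feasible set and a precomputed gradient stream as input; the function and parameter names are illustrative, not the paper's.

```python
import numpy as np

def project_ball(y, radius=1.0):
    """Euclidean projection onto a ball of the given radius (an example C)."""
    norm = np.linalg.norm(y)
    return y if norm <= radius else y * (radius / norm)

def psg_greedy(grads, x0, gammas):
    """Greedy PSG (dashed branch of Fig. 1): project after every step,
    so the internal state always stays inside C."""
    x, plays = np.asarray(x0, dtype=float), []
    for g, gamma in zip(grads, gammas):
        x = project_ball(x - gamma * np.asarray(g))
        plays.append(x)
    return plays

def psg_lazy(grads, x0, gammas):
    """Lazy PSG (solid branch of Fig. 1): aggregate the steps without
    projecting, and project only to generate the point that is played."""
    y, plays = np.asarray(x0, dtype=float), []
    for g, gamma in zip(grads, gammas):
        y = y - gamma * np.asarray(g)       # unprojected aggregation
        plays.append(project_ball(y))       # projection only at play time
    return plays
```

In an actual run of the lazy method the gradients would be evaluated at the played points, matching the input column $-\gamma_{n}\nabla f(x_{n})$ of Table 1; here they are passed in precomputed purely to keep the two branches directly comparable.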

Table 1.  Summary of the algorithms discussed in Section 6. The suffix "L" indicates a "lazy" variant; the INPUT column gives the stream of payoff vectors that is used as input for the algorithm and the NORM column specifies the norm of the ambient space; finally, $\xi_{n}$ denotes a zero-mean stochastic process with values in $\mathbb{R}^{d}$.

ALGORITHM | $\mathcal{C}$ | $h(x)$ | $\eta_{n}$ | INPUT | NORM
EW | $\Delta_{d}$ | $\sum_{i} x_{i}\log x_{i}$ | CONSTANT | $v_{n}$ | $\ell^{1}$
EW' | $\Delta_{d}$ | $\sum_{i} x_{i}\log x_{i}$ | $\eta/\sqrt{n}$ | $v_{n}$ | $\ell^{1}$
SFP | $\Delta_{d}$ | ANY | $\eta/n$ | $v_{n}$ | $\ell^{1}$
VSFP | $\Delta_{d}$ | ANY | $\eta n^{-\alpha}$ $(0<\alpha<1)$ | $v_{n}$ | $\ell^{1}$
OGD-L | ANY | $\frac{1}{2}\left\| x \right\|_2^2$ | CONSTANT | $-\nabla f_{n}(x_{n})$ | $\ell^{2}$
OMD-L | ANY | ANY | CONSTANT | $-\nabla f_{n}(x_{n})$ | ANY
PSG-L | ANY | $\frac{1}{2}\left\| x \right\|_2^2$ | $1$ | $-\gamma_{n}\nabla f(x_{n})$ | $\ell^{2}$
MD-L | ANY | ANY | $1$ | $-\gamma_{n}\nabla f(x_{n})$ | ANY
RSA-L | ANY | ANY | $1$ | $-\gamma_{n}(\nabla f(x_{n})+\xi_{n})$ | ANY
SPSG-L | ANY | $\frac{1}{2}\left\| x \right\|_2^2$ | $1$ | $-\gamma_{n}(\nabla f(x_{n})+\xi_{n})$ | $\ell^{2}$
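All of the algorithms in Table 1 can be read as the same lazy template: aggregate the input stream in the dual and map the running total to the feasible set through the mirror (choice) map induced by the regularizer $h$. The snippet below is a sketch of this template, instantiated with the entropic regularizer on the simplex, which yields the exponential-weights formula; the names and interfaces are mine, not the paper's.

```python
import numpy as np

def lazy_mirror_descent(inputs, mirror_map, d):
    """Generic lazy template behind Table 1: aggregate the inputs in the dual,
    y_n = y_{n-1} + input_n, and play x_n = Q(y_{n-1}), where Q is the mirror
    map induced by the regularizer h over the feasible set C."""
    y, plays = np.zeros(d), []
    for v in inputs:                  # v_n, -grad f_n(x_n), -gamma_n grad f(x_n), ...
        plays.append(mirror_map(y))   # map the dual aggregate to the feasible set
        y = y + np.asarray(v, dtype=float)
    return plays

def entropic_map(y, eta=1.0):
    """Q for C = simplex and h(x) = sum_i x_i log x_i: the softmax of eta * y,
    so lazy_mirror_descent with this map reproduces exponential weights (EW)."""
    w = np.exp(eta * (y - y.max()))
    return w / w.sum()
```

Swapping in the Euclidean projection from the previous sketch recovers the OGD-L and PSG-L rows, while feeding the noisy stream $-\gamma_{n}(\nabla f(x_{n})+\xi_{n})$ gives the stochastic variants RSA-L and SPSG-L.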
[1] P. Auer, N. Cesa-Bianchi and C. Gentile, Adaptive and self-confident on-line learning algorithms, Journal of Computer and System Sciences, 64 (2002), 48-75. doi: 10.1006/jcss.2001.1795.
    [2] A. Beck and M. Teboulle, Mirror descent and nonlinear projected subgradient methods for convex optimization, Operations Research Letters, 31 (2003), 167-175.  doi: 10.1016/S0167-6377(02)00231-6.
    [3] M. Benaïm and M. Faure, Consistency of vanishingly smooth fictitious play, Mathematics of Operations Research, 38 (2013), 437-450.  doi: 10.1287/moor.1120.0568.
[4] M. Benaïm, J. Hofbauer and S. Sorin, Stochastic approximations and differential inclusions, part II: Applications, Mathematics of Operations Research, 31 (2006), 673-695. doi: 10.1287/moor.1060.0213.
    [5] L. M. Bregman, The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Computational Mathematics and Mathematical Physics, 7 (1967), 200-217. 
[6] S. Bubeck and N. Cesa-Bianchi, Regret analysis of stochastic and nonstochastic multi-armed bandit problems, Foundations and Trends in Machine Learning, 5 (2012), 1-122. doi: 10.1561/2200000024.
    [7] S. Bubeck, Introduction to online optimization, Lecture Notes, 2011.
    [8] N. Cesa-Bianchi, Analysis of two gradient-based algorithms for on-line regression, in COLT '97: Proceedings of the 10th Annual Conference on Computational Learning Theory, 1997, 163–170. doi: 10.1145/267460.267492.
[9] N. Cesa-Bianchi, Y. Freund, D. Haussler, D. P. Helmbold, R. E. Schapire and M. K. Warmuth, How to use expert advice, Journal of the ACM, 44 (1997), 427-485. doi: 10.1145/258128.258179.
    [10] N. Cesa-Bianchi and G. Lugosi, Prediction, Learning, and Games, Cambridge University Press, 2006. doi: 10.1017/CBO9780511546921.
    [11] D. Fudenberg and D. K. Levine, Consistency and cautious fictitious play, Journal of Economic Dynamics and Control, 19 (1995), 1065-1089.  doi: 10.1016/0165-1889(94)00819-4.
    [12] D. Fudenberg and D. K. Levine, The Theory of Learning in Games, vol. 2 of Economic learning and social evolution, MIT Press, Cambridge, MA, 1998.
    [13] D. Fudenberg and D. K. Levine, Conditional universal consistency, Games and Economic Behavior, 29 (1999), 104-130.  doi: 10.1006/game.1998.0705.
[14] J. Hannan, Approximation to Bayes risk in repeated play, in Contributions to the Theory of Games, Volume III (eds. M. Dresher, A. W. Tucker and P. Wolfe), vol. 39 of Annals of Mathematics Studies, Princeton University Press, Princeton, NJ, 1957, 97–139.
[15] E. Hazan, A survey: The convex optimization approach to regret minimization, in Optimization for Machine Learning (eds. S. Sra, S. Nowozin and S. J. Wright), MIT Press, 2012, 287–304.
    [16] J. Hofbauer and W. H. Sandholm, On the global convergence of stochastic fictitious play, Econometrica, 70 (2002), 2265-2294.  doi: 10.1111/1468-0262.00376.
[17] S. M. Kakade, S. Shalev-Shwartz and A. Tewari, Regularization techniques for learning with matrices, The Journal of Machine Learning Research, 13 (2012), 1865-1890.
    [18] J. Kivinen and M. K. Warmuth, Exponentiated gradient versus gradient descent for linear predictors, Information and Computation, 132 (1997), 1-63.  doi: 10.1006/inco.1996.2612.
    [19] N. Littlestone and M. K. Warmuth, The weighted majority algorithm, Information and Computation, 108 (1994), 212-261.  doi: 10.1006/inco.1994.1009.
[20] S. Mannor and V. Perchet, Approachability, fast and slow, Journal of Machine Learning Research: Workshop and Conference Proceedings, 30 (2013), 1-16.
    [21] A. Mas-Colell, M. D. Whinston and J. R. Green, Microeconomic Theory, Oxford University Press, New York, NY, USA, 1995.
[22] A. S. Nemirovski, A. Juditsky, G. Lan and A. Shapiro, Robust stochastic approximation approach to stochastic programming, SIAM Journal on Optimization, 19 (2009), 1574-1609. doi: 10.1137/070704277.
    [23] A. S. Nemirovski and D. B. Yudin, Problem Complexity and Method Efficiency in Optimization, Wiley, New York, NY, 1983.
    [24] Y. Nesterov, Primal-dual subgradient methods for convex problems, Mathematical Programming, 120 (2009), 221-259.  doi: 10.1007/s10107-007-0149-x.
    [25] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970.
    [26] R. T. Rockafellar and R. J. B. Wets, Variational Analysis, vol. 317 of A Series of Comprehensive Studies in Mathematics, Springer-Verlag, Berlin, 1998. doi: 10.1007/978-3-642-02431-3.
    [27] S. Shalev-Shwartz, Online Learning: Theory, Algorithms, and Applications, PhD thesis, Hebrew University of Jerusalem, 2007. doi: 10.1017/CBO9781107298019.022.
    [28] S. Shalev-Shwartz, Online learning and online convex optimization, Foundations and Trends in Machine Learning, 4 (2011), 107-194.  doi: 10.1561/2200000018.
    [29] S. Sorin, Exponential weight algorithm in continuous time, Mathematical Programming, 116 (2009), 513-528.  doi: 10.1007/s10107-007-0111-y.
[30] V. G. Vovk, Aggregating strategies, in COLT '90: Proceedings of the Third Workshop on Computational Learning Theory, 1990, 371–383. doi: 10.1016/B978-1-55860-146-8.50032-1.
    [31] V. G. Vovk, A game of prediction with expert advice, in COLT '95: Proceedings of the 8th Annual Conference on Computational Learning Theory, 1995, 51–60. doi: 10.1145/225298.225304.
    [32] M. K. Warmuth and A. K. Jagota, Continuous and discrete-time nonlinear gradient descent: Relative loss bounds and convergence, in Electronic Proceedings of the 5th International Symposium on Artificial Intelligence and Mathematics, 1997.
    [33] M. Zinkevich, Online convex programming and generalized infinitesimal gradient ascent, in ICML '03: Proceedings of the 20th International Conference on Machine Learning, 2003.
