×

Stochastic forward-backward splitting for monotone inclusions. (English) Zbl 06586792

Summary: We propose and analyze the convergence of a novel stochastic algorithm for monotone inclusions that are sum of a maximal monotone operator and a single-valued cocoercive operator. The algorithm we propose is a natural stochastic extension of the classical forward-backward method. We provide a non-asymptotic error analysis in expectation for the strongly monotone case, as well as almost sure convergence under weaker assumptions. For minimization problems, we recover rates matching those obtained by stochastic extensions of the so-called accelerated methods. Stochastic quasi-Fejér’s sequences are a key technical tool to prove almost sure convergence.

MSC:

47H05 Monotone operators and generalizations
90C15 Stochastic programming
65K10 Numerical optimization and variational techniques
90C25 Convex programming

Software:

Pegasos
Full Text: DOI

References:

[1] Minty, G.J.: Monotone (nonlinear) operators in Hilbert space. Duke Math. J. 29, 341-346 (1962) · Zbl 0111.31202 · doi:10.1215/S0012-7094-62-02933-2
[2] Brézis, H.: Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert. North-Holland Publishing Co., Amsterdam-London; American Elsevier Publishing Co., Inc., New York (1973) · Zbl 0252.47055
[3] Pascali, D., Sburlan, S.: Nonlinear mappings of monotone type. Martinus Nijhoff Publishers, The Hague; Sijthoff & Noordhoff International Publishers, Alphen aan den Rijn (1978) · Zbl 0423.47021
[4] Bauschke, H.H., Combettes, P.L.: Convex analysis and monotone operator theory in Hilbert spaces. Springer, New York (2011) · Zbl 1218.47001 · doi:10.1007/978-1-4419-9467-7
[5] Zeidler, E.: Nonlinear Functional Analysis and its Applications II/B. Springer, New York (1990) · Zbl 0684.47029 · doi:10.1007/978-1-4612-0985-0
[6] Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pacif. J. Math. 33, 209-216 (1970) · Zbl 0199.47101 · doi:10.2140/pjm.1970.33.209
[7] Rockafellar, R.T.: Monotone operators associated with saddle-functions and minimax problems. In: Nonlinear Functional Analysis, Proceedings of Symposia in Pure Mathematics, Vol. XVIII, Part 1, Chicago, Ill., 1968, pp. 241-250. Amer. Math. Soc., Providence, R.I. (1970) · Zbl 1242.62011
[8] Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimation 53(5-6), 475-504 (2004) · Zbl 1153.47305 · doi:10.1080/02331930412331327157
[9] Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964-979 (1979) · Zbl 0426.65050 · doi:10.1137/0716071
[10] Passty, G.B.: Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. J. Math. Anal. Appl. 72(2), 383-390 (1979) · Zbl 0428.47039 · doi:10.1016/0022-247X(79)90234-8
[11] Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183-202 (2009) · Zbl 1175.94009 · doi:10.1137/080716542
[12] Combettes, P.L., Vũ, B.C.: Variable metric quasi-Fejér monotonicity. Nonlinear Anal. 78, 17-31 (2013) · Zbl 1266.65087 · doi:10.1016/j.na.2012.09.008
[13] Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul 4(4), 1168-1200 (2005). (electronic) · Zbl 1179.94031 · doi:10.1137/050626090
[14] Nesterov, Y.: Gradient methods for minimizing composite objective function. CORE Discussion Paper 2007/76, Catholic University of Louvain (2007)
[15] Villa, S., Salzo, S., Baldassarre, L., Verri, A.: Accelerated and inexact forward-backward algorithms. SIAM J. Optim. 23(3), 1607-1633 (2013) · Zbl 1295.90049 · doi:10.1137/110844805
[16] Duchi, J., Singer, Y.: Efficient online and batch learning using forward backward splitting. J. Mach. Learn. Res. 10, 2899-2934 (2009) · Zbl 1235.62151
[17] Atchade, Y.F., Fort, G., Moulines, E.: On stochastic proximal gradient algorithms. arXiv:1402.2365 (2014) · Zbl 1433.90199
[18] Rosasco, L., Villa, S., and Vũ, B.C.: Convergence of stochastic proximal gradient algorithm. arXiv:1403.5074 (2014) · Zbl 1465.90101
[19] Juditsky, A., Nemirovski, A., Tauvel, C.: Solving variational inequalities with stochastic mirror-prox algorithm. Stoch. Syst. 1(1), 17-58 (2011) · Zbl 1291.49006 · doi:10.1214/10-SSY011
[20] Jiang, H., Xu, H.: Stochastic approximation approaches to the stochastic variational inequality problem. IEEE Trans. Automat. Control 53(6), 1462-1475 (2008) · Zbl 1367.90072 · doi:10.1109/TAC.2008.925853
[21] Koshal, J., Nedic, A., Shanbhag, U.V.: Regularized iterative stochastic approximation methods for stochastic variational inequality problems. IEEE Trans. Autom. Contrl. 58(3), 594-609 (2013) · Zbl 1369.49012 · doi:10.1109/TAC.2012.2215413
[22] Polyak, B.: Introduction to Optimization. Optimization Software, New York (1987) · Zbl 0708.90083
[23] Ermol’ev, Y.M.: On the method of generalized stochastic gradients and quasi-Fejér sequences. Cybernetics 5(2), 208-220 (1969) · doi:10.1007/BF01071091
[24] Ermol’ev, Y.M., Tuniev, A.D.: Random Fejér and quasi-Fejér sequences. Theory of Optimal Solutions-Akademiya Nauk Ukrainskoĭ SSR Kiev 2, 76-83 (1968) · Zbl 0281.90053
[25] Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574-1609 (2008) · Zbl 1189.90109 · doi:10.1137/070704277
[26] Combettes, P.L., Pesquet, J.C.: Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping. SIAM J. Optim. 25(2), 1221-1248 (2015) · Zbl 1317.65135 · doi:10.1137/140971233
[27] Combettes, P.L., Pesquet, J.C.: Stochastic approximations and perturbations in forward-backward splitting for monotone operators. Pure Appl. Funct. Anal. 1(1), 13-37 (2016) · Zbl 1377.90109
[28] Lin, Q., Chen, X., Peña, J.: A sparsity preserving stochastic gradient methods for sparse regression. Comput. Optim. Appl. 58, 455-482 (2014) · Zbl 1401.62129 · doi:10.1007/s10589-013-9633-9
[29] Moreau, J.J.: Fonctions convexes duales et points proximaux dans un espace hilbertien. C. R. Acad. Sci. Paris 255, 2897-2899 (1962) · Zbl 0118.10502
[30] Ermol’ev, J.M.: The convergence of random quasi-Féjer sequences. Kibernetika (Kiev) 4, 70-71 (1971)
[31] Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53, 475-504 (2004) · Zbl 1153.47305 · doi:10.1080/02331930412331327157
[32] Attouch, H., Czarnecki, M., Peypouquet, J.: Coupling forward-backward with penalty schemes and parallel splitting for constrained variational inequalities. SIAM J. Optim. 21(4), 1251-1274 (2011) · Zbl 1233.37063 · doi:10.1137/110820300
[33] Rosasco, L., Villa, S., Vũ, B.C.: A stochastic inertial forward-backward splitting algorithm for multivariate monotone inclusions. Optimization (2016) · Zbl 06587813
[34] Bertsekas, D.P., Tsitsiklis, J.N.: Gradient convergence in gradient methods with errors. SIAM J. Optim. 10(3), 627-642 (2000). (electronic) · Zbl 1049.90130 · doi:10.1137/S1052623497331063
[35] Bach, F., Moulines, E.: Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In: Advances in Neural Information Processing Systems, vol. 24 (2011) · Zbl 1295.90049
[36] Barty, K., Roy, J.S., Strugarek, C.: Hilbert-valued perturbed subgradient algorithms. Math. Oper. Res. 32(3), 551-562 (2007) · Zbl 1341.90093 · doi:10.1287/moor.1070.0253
[37] Auslender, A.: Optimisation. Masson, Paris, New York, Barcelona (1976) · Zbl 0326.90057
[38] Lions, J.L., Stampacchia, G.: Variational inequalities. Commun. Pure Appl. Math. 20, 493-519 (1967) · Zbl 0152.34601 · doi:10.1002/cpa.3160200302
[39] Shapiro, A.: Monte Carlo sampling methods. In: Stochastic Programming, Handbooks in Operations Research and Management Science, vol. 10, pp. 353-425. Elsevier Sci. B. V., Amsterdam (2003) · Zbl 1179.94031
[40] Chen, X., Wets, R.J.B., Zhang, Y.: Stochastic variational inequalities:residual minimization smoothing sample average approximations. SIAM J. Optim. 22(2), 649-673 (2012) · Zbl 1263.90098 · doi:10.1137/110825248
[41] Martinet, B.: Regularisation d’inequations variationelles par approximations successives. Revue Francaise d?Informatique et de Recherche Opérationelle 4, 154-159 (1970) · Zbl 0215.21103
[42] Browder, F.E.: Nonlinear monotone operators and convex sets in Banach spaces. Bull. Am. Math. Soc. 71, 780-785 (1965) · Zbl 0138.39902 · doi:10.1090/S0002-9904-1965-11391-X
[43] Devolder, O.: Stochastic first order methods in smooth convex optimization. Tech. rep., Center for Operations Research and econometrics (2011)
[44] Lan, G.: An optimal method for stochastic composite optimization. Math. Prog. 133(1-2, Ser. A), 365-397 (2012) · Zbl 1273.90136 · doi:10.1007/s10107-010-0434-y
[45] Duchi, J., Agarwal, A., Johansson, M., Jordan, M.: Ergodic mirror descent. SIAM J. Optim. 22(4), 1549-1578 (2012) · Zbl 1262.90114 · doi:10.1137/110836043
[46] Kwok, J.T., C. Hu, Pan, W.: Accelerated gradient methods for stochastic optimization and online learning. In: Advances in Neural Information Processing Systems, vol. 22, pp. 781-789 (2009) · Zbl 1153.47305
[47] Xiao, L.: Dual averaging methods for regularized stochastic learning and online optimization. J. Mach. Learn. Res. 11, 2543-2596 (2010) · Zbl 1242.62011
[48] Bottou, L., Le Cun, Y.: Online learning for very large data sets. Appl. Stoch. Models Bus. 21(2), 137-151 (2005) · Zbl 1091.68063 · doi:10.1002/asmb.538
[49] Shalev-Shwartz, S., Singer, Y., Srebro, N.: Pegasos: primal estimated sub-gradient solver for SVM. In: Proceedings ICML (2007) · Zbl 1211.90239
[50] Shalev-Shwartz, S., Srebro, N.: SVM optimization: inverse dependence on training set size. In: Proceedings ICML (2008) · Zbl 1091.68063
[51] Zhang, T.: Multi-stage convex relaxation for learning with sparse regularization. In: Advances in Neural Information Processing Systems, pp. 1929-1936 (2008) · Zbl 1178.62089
[52] Yousefian, F., Nedić, A., Shanbhag, U.V.: On stochastic gradient and subgradient methods with adaptive steplength sequences. Autom. J. IFAC 48(1), 56-67 (2012) · Zbl 1244.93178 · doi:10.1016/j.automatica.2011.09.043
[53] Ghadimi, S., Lan, G.: Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization i: a generic algorithmic framework. SIAM J. Optim. 22(4), 1469-1492 (2012) · Zbl 1301.62077 · doi:10.1137/110848864
[54] Ghadimi, S., Lan, G.: Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization ii: shrinking procedures and optimal algorithms. SIAM J. Optim. 23(4), 2061-2089 (2013) · Zbl 1293.62167 · doi:10.1137/110848876
[55] Hazan, E., Kale, S.: Beyond the regret minimization barrier:optimal algorithms for stochastic strongly-convex optimization. J. Mach. Learn. Res. 15, 2489-2512 (2014) · Zbl 1319.90050
[56] Juditsky, A., Nesterov, Y.E.: Deterministic and stochastic primal-dual subgradient methods for minimizing uniformly convex functions. Stoch. Syst. 4(1), 44-80 (2014) · Zbl 1297.90097 · doi:10.1214/10-SSY010
[57] Robbins, H., Siegmund, D.: A convergence theorem for non negative almost supermartingales and some applications. In: Optimizing Methods in Statistics, pp. 233-257. Academic Press, New York (1971) · Zbl 0286.60025
[58] Kushner, H.J., Clark, D.S.: Stochastic Approximation Methods for Constrained and Unconstrained Systems, vol. 26. Springer, New York (1978) · Zbl 0381.60004 · doi:10.1007/978-1-4684-9352-8
[59] Benveniste, A., Métivier, M., Priouret, P.: Adaptive Algorithms and Stochastic Approximations. Springer, Berlin (1990) · Zbl 0752.93073 · doi:10.1007/978-3-642-75894-2
[60] Chen, X., White, H.: Asymptotic properties of some projection-based robbins-monro procedures in a Hilbert space. Stud. Nonlinear Dyn. Econ. 6, 1-53 (2002) · Zbl 1178.62089
[61] Bennar, A., Monnez, J.M.: Almost sure convergence of a stochastic approximation process in a convex set. Int. J. Appl. Math. 20(5), 713-722 (2007) · Zbl 1130.62086
[62] Monnez, J.M.: Almost sure convergence of stochastic gradient processes with matrix step sizes. Stat. Probab. Lett. 76(5), 531-536 (2006) · Zbl 1087.62092 · doi:10.1016/j.spl.2005.09.014
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.