×

Fully stochastic trust-region sequential quadratic programming for equality-constrained optimization problems. (English) Zbl 07870814

Summary: We propose a trust-region stochastic sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic objectives and deterministic equality constraints. We consider a fully stochastic setting, where at each step a single sample is generated to estimate the objective gradient. The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices (i.e., Hessians without modification) in SQP subproblems. As a trust-region method for constrained optimization, our algorithm must address an infeasibility issue – the linearized equality constraints and trust-region constraints may lead to infeasible SQP subproblems. In this regard, we propose an adaptive relaxation technique to compute the trial step, consisting of a normal step and a tangential step. To control the lengths of these two steps while ensuring a scale-invariant property, we adaptively decompose the trust-region radius into two segments, based on the proportions of the rescaled feasibility and optimality residuals to the rescaled full KKT residual. The normal step has a closed form, while the tangential step is obtained by solving a trust-region subproblem, to which a solution ensuring the Cauchy reduction is sufficient for our study. We establish a global almost sure convergence guarantee for TR-StoSQP and illustrate its empirical performance on both a subset of problems in the CUTEst test set and constrained logistic regression problems using data from the LIBSVM collection.

MSC:

90C15 Stochastic programming
90C30 Nonlinear programming
90C55 Methods of successive quadratic programming type
90C90 Applications of mathematical programming

Software:

CUTEst; LIBSVM; Ipopt

References:

[1] Berahas, A. S., Bollapragada, R., and Zhou, B., An Adaptive Sampling Sequential Quadratic Programming Method for Equality Constrained Stochastic Optimization, preprint, https://arxiv.org/abs/2206.00712, 2022.
[2] Berahas, A. S., Curtis, F. E., O’Neill, M. J., and Robinson, D. P., A stochastic sequential quadratic optimization algorithm for nonlinear equality constrained optimization with rank-deficient Jacobians, Math. Oper. Res., (2023), .
[3] Berahas, A. S., Curtis, F. E., Robinson, D., and Zhou, B., Sequential quadratic optimization for nonlinear equality constrained stochastic optimization, SIAM J. Optim., 31 (2021), pp. 1352-1379, doi:10.1137/20m1354556. · Zbl 1469.90134
[4] Berahas, A. S., Shi, J., Yi, Z., and Zhou, B., Accelerating stochastic sequential quadratic programming for equality constrained optimization using predictive variance reduction, Comput. Optim. Appl., 86 (2023), pp. 79-116. · Zbl 1532.90058
[5] Bertsekas, D., Network Optimization: Continuous and Discrete Models, Vol. 8, Athena Scientific, Belmont, MA, 1998, http://www.athenasc.com/netbook.html. · Zbl 0997.90505
[6] Bertsekas, D. P., Constrained Optimization and Lagrange Multiplier Methods, Elsevier, Amsterdam, 1982, doi:10.1016/c2013-0-10366-2. · Zbl 0572.90067
[7] Birge, J. R., State-of-the-art-survey—stochastic programming: Computation and applications, INFORMS J. Comput., 9 (1997), pp. 111-133, doi:10.1287/ijoc.9.2.111. · Zbl 0885.90087
[8] Boggs, P. T. and Tolle, J. W., Sequential quadratic programming, Acta Numer., 4 (1995), pp. 1-51, doi:10.1017/s0962492900002518. · Zbl 0828.65060
[9] Bottou, L., Curtis, F. E., and Nocedal, J., Optimization methods for large-scale machine learning, SIAM Rev., 60 (2018), pp. 223-311, doi:10.1137/16m1080173. · Zbl 1397.65085
[10] Byrd, R. H., Schnabel, R. B., and Shultz, G. A., A trust region algorithm for nonlinearly constrained optimization, SIAM J. Numer. Anal., 24 (1987), pp. 1152-1170, doi:10.1137/0724076. · Zbl 0631.65068
[11] Celis, M. R., Dennis, J. Jr., and Tapia, R. A., A Trust Region Strategy for Equality Constrained Optimization, Technical report, Department of Mathematical Sciences, Rice University, Houston, TX, 1984, https://apps.dtic.mil/sti/citations/ADA454933.
[12] Chang, C.-C. and Lin, C.-J., LIBSVM: A library for support vector machines, ACM Trans. Intell. Syst. Technol., 2 (2011), pp. 1-27, doi:10.1145/1961189.1961199.
[13] Chen, C., Tung, F., Vedula, N., and Mori, G., Constraint-aware deep neural network compression, in Computer Vision - ECCV 2018, Springer, Cham, 2018, pp. 409-424, doi:10.1007/978-3-030-01237-3_25.
[14] Chen, R., Menickelly, M., and Scheinberg, K., Stochastic optimization using a trust-region method and random models, Math. Program., 169 (2017), pp. 447-487, doi:10.1007/s10107-017-1141-8. · Zbl 1401.90136
[15] Chen, Y.-L., Na, S., and Kolar, M., Convergence analysis of accelerated stochastic gradient descent under the growth condition, Math. Oper. Res., (2023), .
[16] Curtis, F. E., O’Neill, M. J., and Robinson, D. P., Worst-case complexity of an SQP method for nonlinear equality constrained stochastic optimization, Math. Program., 205 (2024), pp. 431-483. · Zbl 07829609
[17] Curtis, F. E. and Robinson, D. P., Exploiting negative curvature in deterministic and stochastic optimization, Math. Program., 176 (2018), pp. 69-94, doi:10.1007/s10107-018-1335-8. · Zbl 1417.49036
[18] Curtis, F. E., Robinson, D. P., and Zhou, B., Inexact Sequential Quadratic Optimization for Minimizing a Stochastic Objective Function Subject to Deterministic Nonlinear Equality Constraints, preprint, https://arxiv.org/abs/2107.03512, 2021.
[19] Curtis, F. E., Scheinberg, K., and Shi, R., A stochastic trust region algorithm based on careful step normalization, INFORMS J. Optim., 1 (2019), pp. 200-220, doi:10.1287/ijoo.2018.0010.
[20] Curtis, F. E. and Shi, R., A fully stochastic second-order trust region method, Optim. Methods Softw., 37 (2020), pp. 844-877, doi:10.1080/10556788.2020.1852403. · Zbl 1502.90115
[21] Dupacova, J. and Wets, R., Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems, Ann. Statist., 16 (1988), pp. 1517-1549, doi:10.1214/aos/1176351052. · Zbl 0667.62018
[22] El-Alem, M., A global convergence theory for the Celis-Dennis-Tapia trust-region algorithm for constrained optimization, SIAM J. Numer. Anal., 28 (1991), pp. 266-290, doi:10.1137/0728015. · Zbl 0725.65061
[23] Gould, N. I. M., Orban, D., and Toint, P. L., CUTEst: A constrained and unconstrained testing environment with safe threads for mathematical optimization, Comput. Optim. Appl., 60 (2014), pp. 545-557, doi:10.1007/s10589-014-9687-3. · Zbl 1325.90004
[24] Johnson, R. and Zhang, T., Accelerating stochastic gradient descent using predictive variance reduction, Adv. Neural Inf. Process. Syst., 26 (2013), https://proceedings.neurips.cc/paper/2013/hash/ac1dd209cbcc5e5d1c6e28598e8cbbe8-Abstract.html.
[25] Na, S., Anitescu, M., and Kolar, M., Inequality constrained stochastic nonlinear optimization via active-set sequential quadratic programming, Math. Program., 202 (2023), pp. 279-353. · Zbl 1533.90073
[26] Na, S., Anitescu, M., and Kolar, M., An adaptive stochastic sequential quadratic programming with differentiable exact augmented Lagrangians, Math. Program., 199 (2023), pp. 721-791, doi:10.1007/s10107-022-01846-z. · Zbl 1518.90057
[27] Na, S., Derezinski, M., and Mahoney, M. W., Hessian averaging in stochastic Newton methods achieves superlinear convergence, Math. Program., 201 (2023), pp. 473-520. · Zbl 1532.90077
[28] Na, S. and Mahoney, M. W., Asymptotic Convergence Rate and Statistical Inference for Stochastic Sequential Quadratic Programming, preprint, https://arxiv.org/abs/2205.13687, 2022.
[29] Nocedal, J. and Wright, S., Numerical Optimization, Springer, New York, 2006, doi:10.1007/978-0-387-40065-5. · Zbl 1104.65059
[30] Omojokun, E. O., Trust Region Algorithms for Optimization with Nonlinear Equality and Inequality Constraints, Ph.D. thesis, University of Colorado, Boulder, CO, 1989, doi:10.5555/76519.
[31] Powell, M. J. D. and Yuan, Y., A trust region algorithm for equality constrained optimization, Math. Programming, 49 (1990), pp. 189-211, doi:10.1007/bf01588787. · Zbl 0816.90121
[32] Rees, T., Dollar, H. S., and Wathen, A. J., Optimal solvers for PDE-constrained optimization, SIAM J. Sci. Comput., 32 (2010), pp. 271-298, doi:10.1137/080727154. · Zbl 1208.49035
[33] Robbins, H. and Siegmund, D., A convergence theorem for non negative almost supermartingales and some applications, in Optimizing Methods in Statistics, Elsevier, Amsterdam, 1971, pp. 233-257, doi:10.1016/b978-0-12-604550-5.50015-8. · Zbl 0286.60025
[34] Stich, S. U., Unified Optimal Analysis of the (stochastic) Gradient Method, preprint, https://arxiv.org/abs/1907.04232, 2019.
[35] Sun, S. and Nocedal, J., A trust region method for noisy unconstrained optimization, Math. Program., 202 (2023), pp. 445-472, doi:10.1007/s10107-023-01941-9. · Zbl 1526.65023
[36] Vardi, A., A trust region algorithm for equality constrained minimization: Convergence properties and implementation, SIAM J. Numer. Anal., 22 (1985), pp. 575-591, doi:10.1137/0722035. · Zbl 0581.65045
[37] Vaswani, S., Bach, F., and Schmidt, M., Fast and faster convergence of SGD for over-parameterized models and an accelerated perceptron, in Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, , PMLR, 2019, pp. 1195-1204, https://proceedings.mlr.press/v89/vaswani19a.html.
[38] Wächter, A. and Biegler, L. T., On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming, Math. Program., 106 (2005), pp. 25-57, doi:10.1007/s10107-004-0559-y. · Zbl 1134.90542
[39] Yuan, Y., On a subproblem of trust region algorithms for constrained optimization, Math. Programming, 47 (1990), pp. 53-63, doi:10.1007/bf01580852. · Zbl 0711.90062
[40] Yuan, Y., A dual algorithm for minimizing a quadratic function with two quadratic constraints, J. Comput. Math., 9 (1991), pp. 348-359, http://www.jstor.org/stable/45340300. · Zbl 0758.65050
[41] Zhang, Y., Computing a Celis-Dennis-Tapia trust-region step for equality constrained optimization, Math. Programming, 55 (1992), pp. 109-124, doi:10.1007/bf01581194. · Zbl 0773.90056
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.