
Stopping Rules for Gradient Methods for Non-convex Problems with Additive Noise in Gradient

Journal of Optimization Theory and Applications

Abstract

We study the gradient method under the assumption that only an additively inexact gradient is available for a problem that is, in general, non-convex. Both the non-convexity of the objective function and the use of an inexactly specified gradient at each iteration can cause difficulties. For example, the trajectory of the gradient method may move far away from the starting point. At the same time, in the presence of noise, such unbounded drift of the trajectory can also carry the method away from the desired global solution. We analyze the behavior of the trajectory of the gradient method under the assumption of an inexact gradient and the gradient dominance (Polyak–Łojasiewicz) condition. This condition is known to hold for many important non-convex problems and leads to good complexity guarantees for the gradient method. We propose an early stopping rule for the gradient method. First, it guarantees an acceptable quality, in terms of the objective function, of the point at which the method stops. Second, it ensures that this point remains at a fairly moderate distance from the chosen initial position. In addition to the gradient method with a constant step size, we also study in detail its variant with an adaptive step size, which makes the developed technique applicable when the Lipschitz constant of the gradient is unknown. Computational experiments demonstrate the effectiveness of the proposed stopping rule for the investigated gradient methods.
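To make the idea of such a rule concrete, the following is a minimal sketch in Python of constant-step gradient descent driven by an additively noisy gradient. It is not the authors' exact algorithm: the threshold factor * delta, the noise model in noisy_gradient, and the quadratic test problem are illustrative assumptions; the paper derives a precise stopping threshold together with guarantees on the function value and on the distance from the starting point.

    import numpy as np

    def noisy_gradient(grad, x, delta, rng):
        # Exact gradient plus an additive perturbation of norm at most delta.
        v = rng.standard_normal(x.shape)
        v *= delta / max(np.linalg.norm(v), 1e-16)
        return grad(x) + v

    def gd_with_early_stopping(grad, x0, L, delta, factor=2.0, max_iter=10_000):
        # Constant-step (1/L) gradient descent with an inexact gradient.
        # Hypothetical stopping rule: terminate once the observed gradient
        # norm drops below factor * delta, i.e. once further progress would
        # be dominated by the noise.
        rng = np.random.default_rng(0)
        x = np.array(x0, dtype=float)
        for k in range(max_iter):
            g = noisy_gradient(grad, x, delta, rng)
            if np.linalg.norm(g) <= factor * delta:
                return x, k
            x = x - g / L
        return x, max_iter

    # Illustrative test problem: f(x) = 0.5 * ||A x - b||^2 satisfies the
    # gradient dominance (PL) condition.
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    grad = lambda x: A.T @ (A @ x - b)
    L = np.linalg.norm(A.T @ A, 2)   # Lipschitz constant of the gradient
    x0 = np.zeros(2)
    x_out, n_iter = gd_with_early_stopping(grad, x0, L, delta=1e-3)
    print(f"stopped after {n_iter} iterations at distance "
          f"{np.linalg.norm(x_out - x0):.3f} from the start")

A variant with an adaptive step size, as studied in the article for the case of an unknown Lipschitz constant, would replace the fixed step 1/L above with a step based on a local, iteratively updated estimate of L; the sketch keeps the constant step for brevity.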




Acknowledgements

The research by F. Stonyakin in Sects. 2 and 3.1 was supported by the strategic academic leadership program "Priority 2030" (Agreement 075-02-2021-1316, 30.09.2021). The research by B. Polyak in Sect. 1 and by I. Kuruzov in Sect. 4 was supported by the Russian Science Foundation (project No. 21-71-30005), https://rscf.ru/en/project/21-71-30005/.

Author information


Corresponding author

Correspondence to Fedor Stonyakin.

Additional information

Communicated by Paulo J.S. Silva.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Stonyakin, F., Kuruzov, I. & Polyak, B. Stopping Rules for Gradient Methods for Non-convex Problems with Additive Noise in Gradient. J Optim Theory Appl 198, 531–551 (2023). https://doi.org/10.1007/s10957-023-02245-w

