Volume 4, Issue 2
Convergence of Stochastic Gradient Descent under a Local Łojasiewicz Condition for Deep Neural Networks

Jing An & Jianfeng Lu

J. Mach. Learn., 4 (2025), pp. 89-107.

Published online: 2025-06

[An open-access article; the PDF is free to any online user.]

  • Abstract

We study the convergence of stochastic gradient descent (SGD) for non-convex objective functions. We establish local convergence with positive probability under the local Łojasiewicz condition introduced by Chatterjee [arXiv:2203.16462, 2022] and an additional local structural assumption on the loss landscape. A key component of our proof is to ensure that, with positive probability, the entire SGD trajectory stays inside the local region. We also provide examples of finite-width neural networks for which our assumptions hold.
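
For orientation, the LaTeX sketch below writes out the standard SGD iteration and a generic local Łojasiewicz (gradient-domination) inequality of the kind the abstract refers to. The constant α, the neighborhood B(x_0, r), and the paper's additional structural assumption are not reproduced here, so the display is an illustrative assumption rather than the authors' precise hypotheses.

\begin{align*}
  & x_{k+1} = x_k - \eta_k\, g_k, \qquad \mathbb{E}\bigl[g_k \mid x_k\bigr] = \nabla f(x_k)
    && \text{(SGD step on a non-convex loss } f) \\
  & \|\nabla f(x)\|^{2} \,\ge\, \alpha\, f(x) \quad \text{for all } x \in B(x_0, r),\ \alpha > 0
    && \text{(local Łojasiewicz-type condition near the initialization } x_0)
\end{align*}

Read this way, the result described above amounts to showing that, with positive probability, the stochastic iterates never exit B(x_0, r), so the gradient-domination inequality remains available along the entire trajectory and yields convergence of the loss.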

  • AMS Subject Headings

  • Copyright

COPYRIGHT: © Global Science Press

@Article{JML-4-89,
  author  = {An, Jing and Lu, Jianfeng},
  title   = {Convergence of Stochastic Gradient Descent under a Local Łojasiewicz Condition for Deep Neural Networks},
  journal = {Journal of Machine Learning},
  year    = {2025},
  volume  = {4},
  number  = {2},
  pages   = {89--107},
  issn    = {2790-2048},
  doi     = {https://doi.org/10.4208/jml.240724},
  url     = {http://global-sci.org/intro/article_detail/jml/24143.html}
}

Keywords: Non-convex optimization, Stochastic gradient descent, Convergence analysis.

An, Jing and Lu, Jianfeng. (2025). Convergence of Stochastic Gradient Descent under a Local Łojasiewicz Condition for Deep Neural Networks. Journal of Machine Learning, 4 (2), 89-107. doi:10.4208/jml.240724