Volume 39, Issue 1
Convergence of Backpropagation with Momentum for Network Architectures with Skip Connections

Chirag Agarwal, Joe Klobusicky & Dan Schonfeld

J. Comp. Math., 39 (2021), pp. 147-158.

Published online: 2020-09

  • Abstract

We study a class of deep neural networks whose architectures form a directed acyclic graph (DAG). For backpropagation defined by gradient descent with adaptive momentum, we show that the weights converge for a large class of nonlinear activation functions. The proof generalizes the results of Wu et al. (2008), who showed convergence for a feed-forward network with one hidden layer. To illustrate the effectiveness of DAG architectures, we describe compression through an autoencoder and compare against sequential feed-forward networks under several metrics.
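
The momentum scheme in question is the classical heavy-ball update, w_{t+1} = w_t - eta * grad E(w_t) + mu_t (w_t - w_{t-1}). As a rough illustration only (not the paper's construction, and with a fixed rather than adaptive momentum coefficient mu), the NumPy sketch below applies this update to a one-hidden-layer network augmented with a single skip connection, so that its computation graph is a small DAG rather than a chain; all names and hyperparameters here are illustrative choices:

# Illustrative sketch: heavy-ball momentum on a tiny skip-connection network.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) on [-2, 2].
X = rng.uniform(-2, 2, size=(200, 1))
Y = np.sin(X)

# One hidden tanh layer plus a linear skip path input -> output,
# so the computation graph is a small DAG rather than a pure chain.
W1 = rng.normal(scale=0.5, size=(1, 16))   # input -> hidden
W2 = rng.normal(scale=0.5, size=(16, 1))   # hidden -> output
Ws = rng.normal(scale=0.5, size=(1, 1))    # skip: input -> output

eta, mu = 0.02, 0.8                        # step size, fixed momentum (illustrative)
prev = [W1.copy(), W2.copy(), Ws.copy()]   # stores w_{t-1} for each weight matrix

for step in range(5000):
    # Forward pass.
    H = np.tanh(X @ W1)
    out = H @ W2 + X @ Ws                  # skip connection enters the output here
    err = out - Y

    # Backward pass for mean squared error.
    g_out = 2 * err / len(X)
    gW2 = H.T @ g_out
    gWs = X.T @ g_out                      # gradient also flows along the skip path
    gH = g_out @ W2.T
    gW1 = X.T @ (gH * (1 - H**2))          # tanh'(z) = 1 - tanh(z)^2

    # Heavy-ball update: w <- w - eta * g + mu * (w - w_prev).
    for W, g, i in [(W1, gW1, 0), (W2, gW2, 1), (Ws, gWs, 2)]:
        new = W - eta * g + mu * (W - prev[i])
        prev[i] = W.copy()
        W[...] = new                       # update in place

print("final MSE:", float(np.mean(err**2)))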

  • AMS Subject Headings

68M07, 68T01

  • Copyright

COPYRIGHT: © Global Science Press

  • Email address

chiragagarwall12@gmail.com (Chirag Agarwal)

klobuj@rpi.edu (Joe Klobusicky)

dans@uic.edu (Dan Schonfeld)

  • BibTeX

@Article{JCM-39-147,
  author  = {Agarwal, Chirag and Klobusicky, Joe and Schonfeld, Dan},
  title   = {Convergence of Backpropagation with Momentum for Network Architectures with Skip Connections},
  journal = {Journal of Computational Mathematics},
  year    = {2020},
  volume  = {39},
  number  = {1},
  pages   = {147--158},
  issn    = {1991-7139},
  doi     = {https://doi.org/10.4208/jcm.1912-m2018-0279},
  url     = {http://global-sci.org/intro/article_detail/jcm/18282.html}
}
  • RIS

TY  - JOUR
T1  - Convergence of Backpropagation with Momentum for Network Architectures with Skip Connections
AU  - Agarwal, Chirag
AU  - Klobusicky, Joe
AU  - Schonfeld, Dan
JO  - Journal of Computational Mathematics
VL  - 39
IS  - 1
SP  - 147
EP  - 158
PY  - 2020
DA  - 2020/09
SN  - 1991-7139
DO  - 10.4208/jcm.1912-m2018-0279
UR  - https://global-sci.org/intro/article_detail/jcm/18282.html
KW  - Backpropagation with momentum
KW  - Autoencoders
KW  - Directed acyclic graphs
ER  -

  • TXT

Agarwal, Chirag, Klobusicky, Joe and Schonfeld, Dan. (2020). Convergence of Backpropagation with Momentum for Network Architectures with Skip Connections. Journal of Computational Mathematics. 39 (1). 147-158. doi:10.4208/jcm.1912-m2018-0279