Proximal-Proximal-Gradient Method

Ernest K. Ryu & Wotao Yin

J. Comp. Math., 37 (2019), pp. 778-812.

Published online: 2019-11

  • Abstract

In this paper, we present the proximal-proximal-gradient method (PPG), a novel optimization method that is simple to implement and simple to parallelize. PPG generalizes the proximal-gradient method and ADMM and is applicable to minimization problems written as a sum of many differentiable and many non-differentiable convex functions. The non-differentiable functions can be coupled. We furthermore present a related stochastic variant, which we call stochastic PPG (S-PPG). S-PPG can be interpreted as a generalization of Finito and MISO to sums of many coupled non-differentiable convex functions.
We present many applications that can benefit from PPG and S-PPG and prove convergence for both methods. We demonstrate the empirical effectiveness of both methods through experiments on a CUDA GPU. A key strength of PPG and S-PPG, compared to existing methods, is their ability to directly handle a large sum of non-differentiable, non-separable functions with a constant step size that is independent of the number of functions. Such a non-diminishing step size allows them to be fast.
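
The abstract describes the problem class and the role of proximal steps only in words; the LaTeX display below writes them out for orientation. It is an illustrative sketch: the symbols r, f_i, g_i, z_i, the step size α, and the exact update rule are assumptions pieced together from the abstract and the standard operator-splitting literature, not notation quoted from the paper, and should be verified against the paper itself.

% Problem template: a sum of many differentiable and many (possibly coupled)
% non-differentiable convex functions, as described in the abstract.
\[
  \underset{x\in\mathbb{R}^d}{\text{minimize}}\quad
  r(x) + \frac{1}{n}\sum_{i=1}^{n}\bigl(f_i(x)+g_i(x)\bigr),
\]
% where each f_i is convex and differentiable and r, g_i are convex with
% inexpensive proximal operators, defined in the usual way:
\[
  \operatorname{prox}_{\alpha g}(v)
  = \operatorname*{argmin}_{x}\Bigl\{\,g(x) + \tfrac{1}{2\alpha}\lVert x-v\rVert^{2}\Bigr\}.
\]
% A commonly cited form of the PPG iteration for this template (an assumption;
% the abstract does not state the update) keeps one auxiliary point z_i per
% non-smooth term and, with a constant step size alpha, iterates:
\begin{align*}
  x^{k+1/2} &= \operatorname{prox}_{\alpha r}\Bigl(\tfrac{1}{n}\textstyle\sum_{i=1}^{n} z_i^{k}\Bigr),\\
  x_i^{k+1} &= \operatorname{prox}_{\alpha g_i}\bigl(2x^{k+1/2} - z_i^{k} - \alpha\nabla f_i(x^{k+1/2})\bigr),\\
  z_i^{k+1} &= z_i^{k} + x_i^{k+1} - x^{k+1/2}, \qquad i=1,\dots,n.
\end{align*}

On this reading, the stochastic variant S-PPG would update only one randomly chosen index i per iteration while reusing the stored z_j for j ≠ i, which is one way to interpret the abstract's comparison to Finito and MISO; this, too, is an inference from the abstract rather than a statement of the paper's algorithm.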

  • AMS Subject Headings

47N10, 65K05, 90C06, 90C25, 90C30

  • Copyright

COPYRIGHT: © Global Science Press

  • Email address

eryu@math.ucla.edu (Ernest K. Ryu)

wotaoyin@math.ucla.edu (Wotao Yin)

  • BibTex

@Article{JCM-37-778,
  author  = {Ryu, Ernest K. and Yin, Wotao},
  title   = {Proximal-Proximal-Gradient Method},
  journal = {Journal of Computational Mathematics},
  year    = {2019},
  volume  = {37},
  number  = {6},
  pages   = {778--812},
  issn    = {1991-7139},
  doi     = {10.4208/jcm.1906-m2018-0282},
  url     = {http://global-sci.org/intro/article_detail/jcm/13374.html}
}
  • RIS

TY  - JOUR
T1  - Proximal-Proximal-Gradient Method
AU  - Ryu, Ernest K.
AU  - Yin, Wotao
JO  - Journal of Computational Mathematics
VL  - 37
IS  - 6
SP  - 778
EP  - 812
PY  - 2019
DA  - 2019/11
SN  - 1991-7139
DO  - 10.4208/jcm.1906-m2018-0282
UR  - https://global-sci.org/intro/article_detail/jcm/13374.html
KW  - Proximal-gradient
KW  - ADMM
KW  - Finito
KW  - MISO
KW  - SAGA
KW  - Operator splitting
KW  - First-order methods
KW  - Distributed optimization
KW  - Stochastic optimization
KW  - Almost sure convergence
KW  - Linear convergence
ER  -

  • TXT

Ryu, Ernest K. and Yin, Wotao. (2019). Proximal-Proximal-Gradient Method. Journal of Computational Mathematics. 37 (6). 778-812. doi:10.4208/jcm.1906-m2018-0282