Computational Optimization and Applications

Publication date: 2017-07-01
Volume: 67, Pages: 443–487
Publisher: Kluwer Academic Publishers

Authors:

Stella, Lorenzo; Themelis, Andreas; Patrinos, Panos

Keywords:

Nonsmooth optimization; Forward–backward splitting; Line-search methods; Quasi-Newton; Kurdyka–Łojasiewicz

Abstract:

© 2017, Springer Science+Business Media New York. The forward–backward splitting method (FBS) for minimizing a nonsmooth composite function can be interpreted as a (variable-metric) gradient method applied to a continuously differentiable function which we call the forward–backward envelope (FBE). This makes it possible to extend algorithms for smooth unconstrained optimization to nonsmooth (possibly constrained) problems. Since the FBE can be computed simply by evaluating forward–backward steps, the resulting methods rely on a black-box oracle similar to that of FBS. We propose an algorithmic scheme that enjoys the same global convergence properties as FBS when the problem is convex, or when the objective function possesses the Kurdyka–Łojasiewicz property at its critical points. Moreover, when quasi-Newton directions are used, the proposed method achieves superlinear convergence provided that the usual second-order sufficiency conditions on the FBE hold at the limit point of the generated sequence. These conditions translate into milder requirements on the original function involving generalized second-order differentiability. We show that BFGS fits our framework and that its limited-memory variant L-BFGS is well suited for large-scale problems, greatly outperforming FBS and its accelerated version in practice, as well as ADMM and other problem-specific solvers. The analysis of superlinear convergence is based on an extension of the Dennis–Moré theorem to the proposed algorithmic scheme.
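To make the abstract's key point concrete — that the FBE is evaluated with the same oracle calls as one forward–backward step — the following is a minimal sketch on an illustrative lasso instance, f(x) = ½‖Ax − b‖² and g(x) = λ‖x‖₁. The problem data, step-size choice, and all function names here are this sketch's own assumptions, not the paper's code; it is plain FBS, not the quasi-Newton scheme the paper proposes.

```python
import numpy as np

# Illustrative lasso instance (not from the paper): minimize f(x) + g(x)
# with f(x) = 0.5*||Ax - b||^2 (smooth) and g(x) = lam*||x||_1 (nonsmooth).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
lam = 0.1
# Step size gamma = 1/L, where L = ||A^T A||_2 is the Lipschitz constant
# of grad f.
gamma = 1.0 / np.linalg.norm(A.T @ A, 2)

def grad_f(x):
    return A.T @ (A @ x - b)

def prox_g(x, gamma):
    # Soft-thresholding: the proximal mapping of lam*||.||_1.
    return np.sign(x) * np.maximum(np.abs(x) - gamma * lam, 0.0)

def fbs_step(x):
    # One forward-backward step: gradient step on f, prox step on g.
    return prox_g(x - gamma * grad_f(x), gamma)

def phi(x):
    # The original nonsmooth objective f(x) + g(x).
    return 0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.abs(x).sum()

def fbe(x):
    # Forward-backward envelope, assembled from the same quantities as
    # one FBS step:
    #   phi_gamma(x) = f(x) + <grad f(x), z - x> + g(z) + ||z - x||^2/(2*gamma),
    # where z = prox_{gamma*g}(x - gamma*grad f(x)). By construction
    # phi_gamma(x) <= phi(x), with equality at fixed points of FBS.
    z = fbs_step(x)
    d = z - x
    return (0.5 * np.linalg.norm(A @ x - b) ** 2
            + grad_f(x) @ d
            + lam * np.abs(z).sum()
            + d @ d / (2 * gamma))

# Run plain FBS to (approximate) convergence.
x = np.zeros(10)
for _ in range(2000):
    x = fbs_step(x)
```

The point of the sketch is the `fbe` function: it reuses the gradient and prox evaluations of a single FBS step, so any line-search or quasi-Newton method driven by the FBE needs no richer oracle than FBS itself.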