ENH: Add three solvers for differences of convex functions. #1307
odl/solvers/nonsmooth/difference_convex.py
@@ -0,0 +1,268 @@
# Copyright 2014-2018 The ODL contributors
#
# This file is part of ODL.
#
# This Source Code Form is subject to the terms of the Mozilla Public License,
# v. 2.0. If a copy of the MPL was not distributed with this file, You can
# obtain one at https://mozilla.org/MPL/2.0/.

"""Solvers for the optimization of the difference of convex functions.

Collection of DCA (d.c. algorithms) and related methods which make use of
structured optimization if the objective function can be written as a
difference of two convex functions.
"""

from __future__ import print_function, division, absolute_import

__all__ = ('dca', 'prox_dca', 'doubleprox_dc')


def dca(x, f, g, niter, callback=None):
    r"""Subgradient DCA of Tao and An.

    This algorithm solves a problem of the form ::

        min_x f(x) - g(x),

    where ``f`` and ``g`` are proper, convex and lower semicontinuous
    functions.

    Parameters
    ----------
    x : `LinearSpaceElement`
        Initial point, updated in-place.
    f : `Functional`
        Convex functional. Needs to implement ``f.convex_conj.gradient``.
    g : `Functional`
        Convex functional. Needs to implement ``g.gradient``.
    niter : int
        Number of iterations.
    callback : callable, optional
        Function called with the current iterate after each iteration.

    Notes
    -----
    The algorithm is described in Section 3 and in particular in Theorem 3 of
    `[TA1997] <http://journals.math.ac.vn/acta/pdf/9701289.pdf>`_. The problem

    .. math::
        \min f(x) - g(x)

    has the first-order optimality condition :math:`0 \in \partial f(x) -
    \partial g(x)`, i.e., the algorithm aims at finding an :math:`x` for
    which there exists a common element

    .. math::
        y \in \partial f(x) \cap \partial g(x).

    The element :math:`y` can be seen as a solution of the Toland dual problem

    .. math::
        \min g^*(y) - f^*(y),

    and the iteration is given by

    .. math::
        y_n \in \partial g(x_n), \qquad x_{n+1} \in \partial f^*(y_n),

    for :math:`n \geq 0`. Here, a subgradient is obtained by calling the
    ``gradient`` method of the respective functionals.

    References
    ----------
    [TA1997] Tao, P D, and An, L T H. *Convex analysis approach to d.c.
    programming: Theory, algorithms and applications*. Acta Mathematica
    Vietnamica, 22.1 (1997), pp 289--355.

    See also
    --------
    prox_dca :
        Solver with a proximal step for ``f`` and a subgradient step for
        ``g``.
    doubleprox_dc :
        Solver with proximal steps for all the nonsmooth convex functionals
        and a gradient step for a smooth functional.
    """
    space = f.domain
    if g.domain != space:
        raise ValueError('`f.domain` and `g.domain` need to be equal, but '
                         '{} != {}'.format(space, g.domain))
    f_convex_conj = f.convex_conj
    for _ in range(niter):
        f_convex_conj.gradient(g.gradient(x), out=x)

        if callback is not None:
            callback(x)

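A minimal usage sketch (illustration only, not part of the diff), mirroring
the test further down: with ``a = 0.5`` and ``b = 0.5``, the critical points
of ``a/2 (x - b)^2 - |x|`` are ``b - 1/a = -1.5``, ``0`` and
``b + 1/a = 2.5``:

    import odl
    from odl.solvers import dca

    space = odl.rn(1)
    f = 0.5 / 2 * odl.solvers.L2NormSquared(space).translated(0.5)
    g = odl.solvers.L1Norm(space)
    x = space.element(-0.5)
    dca(x, f, g, niter=50)  # x is updated in-place

For this quadratic ``f``, the conjugate has gradient ``grad f*(y) = b + y/a``
and the subgradient of ``|.|`` is ``sign(x)``, so the update reads
``x_{n+1} = b + sign(x_n)/a``; starting from ``x = -0.5``, the iteration
reaches the critical point ``-1.5`` after one step.
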
def prox_dca(x, f, g, niter, gamma, callback=None):
    r"""Proximal DCA of Sun, Sampaio and Candido.

    This algorithm solves a problem of the form ::

        min_x f(x) - g(x),

    where ``f`` and ``g`` are proper, convex and lower semicontinuous
    functions.

    Parameters
    ----------
    x : `LinearSpaceElement`
        Initial point, updated in-place.
    f : `Functional`
        Convex functional. Needs to implement ``f.proximal``.
    g : `Functional`
        Convex functional. Needs to implement ``g.gradient``.
    niter : int
        Number of iterations.
    gamma : positive float
        Stepsize in the primal updates.
    callback : callable, optional
        Function called with the current iterate after each iteration.

    Notes
    -----
    The algorithm was proposed as Algorithm 2.3 in `[SSC2003]
    <http://www.global-sci.org/jcm/readabs.php?vol=21&no=4&page=451&year=2003&ppage=462>`_.
    It solves the problem

    .. math::
        \min f(x) - g(x)

    by using subgradients of :math:`g` and proximal points of :math:`f`.
    The iteration is given by

    .. math::
        y_n \in \partial g(x_n), \qquad
        x_{n+1} = \mathrm{Prox}_{\gamma f}(x_n + \gamma y_n).

    In contrast to `dca`, `prox_dca` uses proximal steps with respect to the
    convex part ``f``. Both algorithms use subgradients of ``g``.

    References
    ----------
    [SSC2003] Sun, W, Sampaio, R J B, and Candido, M A B. *Proximal point
    algorithm for minimization of DC function*. Journal of Computational
    Mathematics, 21.4 (2003), pp 451--462.

    See also
    --------
    dca :
        Solver with subgradient steps for all the functionals.
    doubleprox_dc :
        Solver with proximal steps for all the nonsmooth convex functionals
        and a gradient step for a smooth functional.
    """
    space = f.domain
    if g.domain != space:
        raise ValueError('`f.domain` and `g.domain` need to be equal, but '
                         '{} != {}'.format(space, g.domain))
    for _ in range(niter):
        f.proximal(gamma)(x.lincomb(1, x, gamma, g.gradient(x)), out=x)

        if callback is not None:
            callback(x)

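A corresponding sketch for ``prox_dca`` (same toy problem as in the ``dca``
sketch above; the step size ``gamma = 1`` matches the test):

    import odl
    from odl.solvers import prox_dca

    space = odl.rn(1)
    f = 0.5 / 2 * odl.solvers.L2NormSquared(space).translated(0.5)
    g = odl.solvers.L1Norm(space)
    x = space.element(-0.5)
    prox_dca(x, f, g, niter=50, gamma=1)

The only difference to ``dca`` is that the convex part ``f`` enters through
its proximal operator instead of through the gradient of its convex
conjugate.
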
def doubleprox_dc(x, y, f, phi, g, K, niter, gamma, mu, callback=None):
    r"""Double-proximal gradient d.c. algorithm of Banert and Bot.

    This algorithm solves a problem of the form ::

        min_x f(x) + phi(x) - g(Kx).

    Parameters
    ----------
    x : `LinearSpaceElement`
        Initial primal guess, updated in-place.
    y : `LinearSpaceElement`
        Initial dual guess, updated in-place.
    f : `Functional`
        Convex functional. Needs to implement ``f.proximal``.
    phi : `Functional`
        Convex functional. Needs to implement ``phi.gradient``.
        Convergence can be guaranteed if the gradient is Lipschitz
        continuous.
    g : `Functional`
        Convex functional. Needs to implement ``g.convex_conj.proximal``.
    K : `Operator`
        Linear operator. Needs to implement ``K.adjoint``.
    niter : int
        Number of iterations.
    gamma : positive float
        Stepsize in the primal updates.
    mu : positive float
        Stepsize in the dual updates.
    callback : callable, optional
        Function called with the current iterate after each iteration.

    Notes
    -----
    This algorithm is proposed in `[BB2016]
    <https://arxiv.org/abs/1610.06538>`_ and solves the d.c. problem

    .. math::
        \min_x f(x) + \varphi(x) - g(Kx)

    together with its Toland dual

    .. math::
        \min_y g^*(y) - (f + \varphi)^*(K^* y).

    The iterations are given by

    .. math::
        x_{n+1} &= \mathrm{Prox}_{\gamma f}(x_n + \gamma (K^* y_n
                   - \nabla \varphi(x_n))), \\
        y_{n+1} &= \mathrm{Prox}_{\mu g^*}(y_n + \mu K x_{n+1}).

    To guarantee convergence, the parameter :math:`\gamma` must satisfy
    :math:`0 < \gamma < 2/L`, where :math:`L` is the Lipschitz constant of
    :math:`\nabla \varphi`.

    References
    ----------
    [BB2016] Banert, S, and Bot, R I. *A general double-proximal gradient
    algorithm for d.c. programming*. arXiv:1610.06538 [math.OC] (2016).

    See also
    --------
    dca :
        Solver with subgradient steps for all the functionals.
    prox_dca :
        Solver with a proximal step for ``f`` and a subgradient step for
        ``g``.
    """
    primal_space = f.domain
    dual_space = g.domain

    if phi.domain != primal_space:
        raise ValueError('`f.domain` and `phi.domain` need to be equal, but '
                         '{} != {}'.format(primal_space, phi.domain))
    if K.domain != primal_space:
        raise ValueError('`f.domain` and `K.domain` need to be equal, but '
                         '{} != {}'.format(primal_space, K.domain))
    if K.range != dual_space:
        raise ValueError('`g.domain` and `K.range` need to be equal, but '
                         '{} != {}'.format(dual_space, K.range))

    g_convex_conj = g.convex_conj
    for _ in range(niter):
        f.proximal(gamma)(x.lincomb(1, x,
                                    gamma, K.adjoint(y) - phi.gradient(x)),
                          out=x)
        g_convex_conj.proximal(mu)(y.lincomb(1, y, mu, K(x)), out=y)

        if callback is not None:
            callback(x)

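A sketch for ``doubleprox_dc`` on the same toy problem, with a zero smooth
part and the identity as ``K`` (exactly the setup used in the test below):

    import odl
    from odl.solvers import doubleprox_dc

    space = odl.rn(1)
    f = 0.5 / 2 * odl.solvers.L2NormSquared(space).translated(0.5)
    g = odl.solvers.L1Norm(space)
    phi = odl.solvers.ZeroFunctional(space)
    K = odl.IdentityOperator(space)
    x = space.element(-0.5)
    y = space.element(3)
    doubleprox_dc(x, y, f, phi, g, K, niter=50, gamma=1, mu=1)

Since ``phi`` is zero here, its gradient has Lipschitz constant 0 and the
step-size restriction from the Notes is formally vacuous; the dual iterate
``y`` approximates a solution of the Toland dual problem.
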
def doubleprox_dc_simple(x, y, f, phi, g, K, niter, gamma, mu):
    """Non-optimized version of ``doubleprox_dc``.

    This function is intended for debugging. It makes a lot of copies and
    performs no error checking.
    """
    for _ in range(niter):
        f.proximal(gamma)(x + gamma * K.adjoint(y) -
                          gamma * phi.gradient(x), out=x)
        g.convex_conj.proximal(mu)(y + mu * K(x), out=y)
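Continuing the names from the previous sketch, a quick equivalence check
between the optimized and the simple variant (this is what the test below
does) could look as follows; the tolerance is an arbitrary choice here:

    x2 = space.element(-0.5)
    y2 = space.element(3)
    doubleprox_dc_simple(x2, y2, f, phi, g, K, 50, 1, 1)
    assert (x - x2).norm() < 1e-8  # x, y from the doubleprox_dc sketch
    assert (y - y2).norm() < 1e-8
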
@@ -0,0 +1,93 @@
# coding=utf-8

Review: Do we usually have this in the top of the files? @kohr-h
Reply: No, but without it, the ∂ symbol in the comment would cause a
Python 2 error, see #1307 (comment).
Reply: In less than two years we can remove it again :-)

# Copyright 2014-2018 The ODL contributors
#
# This file is part of ODL.
#
# This Source Code Form is subject to the terms of the Mozilla Public License,
# v. 2.0. If a copy of the MPL was not distributed with this file, You can
# obtain one at https://mozilla.org/MPL/2.0/.

"""Tests for the d.c. solvers.""" | ||
|
||
from __future__ import division | ||
import odl | ||
from odl.solvers import dca, prox_dca, doubleprox_dc | ||
from odl.solvers.nonsmooth.difference_convex import doubleprox_dc_simple | ||
import numpy as np | ||
import pytest | ||
|
||
|
||
# Places for the accepted error when comparing results | ||
HIGH_ACCURACY = 8 | ||
LOW_ACCURACY = 4 | ||
|
||
|
||
def test_dca():
    """Test the d.c. solvers on the following simple problem:

    Let a > 0, and let b be a real number. Solve

        min_x a/2 (x - b)^2 - |x|.

    To find possible (local) solutions, we consider several cases:

    * x > 0 ==> ∂|.|(x) = 1, i.e., a necessary optimality condition is
      0 = a (x - b) - 1 ==> x = b + 1/a. This only happens if b > -1/a.
    * x < 0 ==> ∂|.|(x) = -1, i.e., a necessary optimality condition is
      0 = a (x - b) + 1 ==> x = b - 1/a. This only happens if b < 1/a.
    * x = 0 ==> ∂|.|(x) = [-1, 1], i.e., a necessary optimality condition is
      0 ∈ a (x - b) + [-1, 1] ==> b - 1/a <= x = 0 <= b + 1/a.

    To summarize, we might find the following solution sets:

    * {b - 1/a} if b < -1/a,
    * {-2/a, 0} if b = -1/a,
    * {b - 1/a, 0, b + 1/a} if -1/a < b < 1/a,
    * {0, 2/a} if b = 1/a,
    * {b + 1/a} if b > 1/a.
    """

Review (on the ∂ symbols above): I guess we need a unicode marker for this.
Reply: Done.
Review: Short comment on that this puts us in the third case?

    # Set the problem parameters
    a = 0.5
    b = 0.5
    # This means -1/a = -2 < b = 0.5 < 1/a = 2, i.e., we are in the third
    # case above with solution set {b - 1/a, 0, b + 1/a}.
    space = odl.rn(1)
    f = a / 2 * odl.solvers.L2NormSquared(space).translated(b)
    g = odl.solvers.L1Norm(space)
    niter = 50

    # Set up some space elements for the solvers to use
    x = space.element(-0.5)
    x_dca = x.copy()
    x_prox_dca = x.copy()
    x_doubleprox = x.copy()
    x_simpl = x.copy()

    # Some additional parameters for some of the solvers
    phi = odl.solvers.ZeroFunctional(space)
    y = space.element(3)
    y_simpl = y.copy()
    gamma = 1
    mu = 1
    K = odl.IdentityOperator(space)

    dca(x_dca, f, g, niter)
    prox_dca(x_prox_dca, f, g, niter, gamma)
    doubleprox_dc(x_doubleprox, y, f, phi, g, K, niter, gamma, mu)
    doubleprox_dc_simple(x_simpl, y_simpl, f, phi, g, K, niter, gamma, mu)
    expected = np.asarray([b - 1 / a, 0, b + 1 / a])

    dist_dca = np.min(np.abs(expected - float(x_dca)))
    dist_prox_dca = np.min(np.abs(expected - float(x_prox_dca)))
    dist_prox_doubleprox = np.min(np.abs(expected - float(x_doubleprox)))

    # Optimized and simplified versions of doubleprox_dc should give
    # the same result.
    assert float(x_simpl) == pytest.approx(float(x_doubleprox))
    assert float(y_simpl) == pytest.approx(float(y))

    # All methods should give approximately one solution of the problem.
    # For 50 iterations, the methods have been tested to achieve an absolute
    # accuracy of at least 1e-6.
    assert float(dist_dca) == pytest.approx(0, abs=1e-6)
    assert float(dist_prox_dca) == pytest.approx(0, abs=1e-6)
    assert float(dist_prox_doubleprox) == pytest.approx(0, abs=1e-6)
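
A conceivable follow-up test (hypothetical, not part of this PR) would cover
the single-solution case b > 1/a from the docstring, where the only critical
point is b + 1/a; for instance with a = 1 and b = 2, reusing the imports of
this file:

    def test_dca_single_solution():
        """For a = 1 and b = 2 > 1/a, the only critical point is b + 1/a."""
        space = odl.rn(1)
        f = 1 / 2 * odl.solvers.L2NormSquared(space).translated(2)
        g = odl.solvers.L1Norm(space)
        x = space.element(-0.5)
        dca(x, f, g, niter=50)
        # The subgradient iteration x_{n+1} = b + sign(x_n)/a reaches
        # b + 1/a = 3 after two steps and then stays there.
        assert float(x) == pytest.approx(3, abs=1e-6)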