ENH: Add three solvers for differences of convex functions. #1307


Merged: 22 commits, merged on Jun 11, 2018

Commits
f12ff7e
ENH: Add three solvers for differences of convex functions.
Jan 29, 2018
4d2787e
BUG: Add a unicode marker to difference_convex_test.py for Python 2.
Mar 2, 2018
f2bb89a
DOC: Replace comment in prox_dca by comment on difference to dca.
Mar 2, 2018
02431f4
DOC: Remove reference without citation.
Mar 2, 2018
35ee2be
DOC: Correction regarding subgradients in documentation of the dca me…
Mar 2, 2018
d922eea
DOC: Repair hyperlink by inserting a space.
Mar 2, 2018
9622d78
MAINT: Optimize proximal d.c. methods.
Mar 2, 2018
8b42d4b
DOC: Matter of taste: take out common factor \gamma.
Mar 2, 2018
c6311bf
TST: Add and test a simplified version of the doubleprox d.c. method.
Mar 2, 2018
c438f71
DOC: Correct gradient/proximal error in prox_dca docstring.
Mar 2, 2018
ba6381c
DOC: Adapt wording of stepsizes.
Mar 2, 2018
1c08d97
DOC: Change docstring of prox_dca to raw string.
Mar 21, 2018
74eb7f7
DOC: Add see also sections to d.c. solvers.
May 3, 2018
64729d1
DOC: Improve and unify wording in functional parameters for d.c. solv…
May 3, 2018
e391d00
ENH: Optimize d.c. solvers by initializing conjugates only once.
May 3, 2018
db63fc9
MAINT: Change the order of the arguments in doubleprox_dc
May 3, 2018
3154baf
DOC: Clarify which case the d.c. test is in.
May 3, 2018
0d273aa
TST: Remove unnecessary parentheses and add comment on accuracy.
May 3, 2018
a513199
DOC: Avoid printing colons.
May 3, 2018
a6bc344
MAINT: Change g and h to f and g, respectively.
May 3, 2018
7da61d2
MAINT: Correct typo in variable name.
May 3, 2018
8bf02cb
TST: Fix test of d.c. solvers.
May 7, 2018
3 changes: 3 additions & 0 deletions odl/solvers/nonsmooth/__init__.py
@@ -32,3 +32,6 @@

from .alternating_dual_updates import *
__all__ += alternating_dual_updates.__all__

from .difference_convex import *
__all__ += difference_convex.__all__
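
This follows the package's existing re-export pattern, so the new solvers become reachable from the ``odl.solvers`` namespace. A minimal sanity-check sketch, assuming this branch is installed::

    import odl

    # The new solvers should be re-exported at the package level.
    assert all(name in odl.solvers.__all__
               for name in ('dca', 'prox_dca', 'doubleprox_dc'))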
268 changes: 268 additions & 0 deletions odl/solvers/nonsmooth/difference_convex.py
@@ -0,0 +1,268 @@
# Copyright 2014-2018 The ODL contributors
#
# This file is part of ODL.
#
# This Source Code Form is subject to the terms of the Mozilla Public License,
# v. 2.0. If a copy of the MPL was not distributed with this file, You can
# obtain one at https://mozilla.org/MPL/2.0/.

"""Solvers for the optimization of the difference of convex functions.

Collection of DCA (d.c. algorithms) and related methods, which exploit the
structure of objective functions that can be written as a difference of two
convex functions.
"""

from __future__ import print_function, division, absolute_import

__all__ = ('dca', 'prox_dca', 'doubleprox_dc')


def dca(x, f, g, niter, callback=None):
r"""Subgradient DCA of Tao and An.

This algorithm solves a problem of the form ::

min_x f(x) - g(x),

where ``f`` and ``g`` are proper, convex and lower semicontinuous
functions.

Parameters
----------
x : `LinearSpaceElement`
Initial point, updated in-place.
f : `Functional`
Convex functional. Needs to implement ``f.convex_conj.gradient``.
g : `Functional`
Convex functional. Needs to implement ``g.gradient``.
niter : int
Number of iterations.
callback : callable, optional
Function called with the current iterate after each iteration.

Notes
-----
The algorithm is described in Section 3 and in particular in Theorem 3 of
`[TA1997] <http://journals.math.ac.vn/acta/pdf/9701289.pdf>`_. The problem

.. math::
\min f(x) - g(x)

has the first-order optimality condition :math:`0 \in \partial f(x) -
\partial g(x)`, i.e., aims at finding an :math:`x` so that there exists a
common element

.. math::
y \in \partial f(x) \cap \partial g(x).

The element :math:`y` can be seen as a solution of the Toland dual problem

.. math::
\min g^*(y) - f^*(y)

and the iteration is given by

.. math::
y_n \in \partial g(x_n), \qquad x_{n+1} \in \partial f^*(y_n),

for :math:`n \geq 0`. Here, subgradients are obtained by evaluating the
``gradient`` of the respective functionals.

References
----------
[TA1997] Tao, P D, and An, L T H. *Convex analysis approach to d.c.
programming: Theory, algorithms and applications*. Acta Mathematica
Vietnamica, 22.1 (1997), pp 289--355.

See also
--------
prox_dca :
Solver with a proximal step for ``f`` and a subgradient step for ``g``.
doubleprox_dc :
Solver with proximal steps for all the nonsmooth convex functionals
and a gradient step for a smooth functional.
"""
space = f.domain
if g.domain != space:
raise ValueError('`f.domain` and `g.domain` need to be equal, but '
'{} != {}'.format(space, g.domain))
f_convex_conj = f.convex_conj
for _ in range(niter):
f_convex_conj.gradient(g.gradient(x), out=x)

if callback is not None:
callback(x)


def prox_dca(x, f, g, niter, gamma, callback=None):
r"""Proximal DCA of Sun, Sampaio and Candido.

This algorithm solves a problem of the form ::

min_x f(x) - g(x)

where ``f`` and ``g`` are two proper, convex and lower semicontinuous
functions.

Parameters
----------
x : `LinearSpaceElement`
Initial point, updated in-place.
f : `Functional`
Convex functional. Needs to implement ``f.proximal``.
g : `Functional`
Convex functional. Needs to implement ``g.gradient``.
niter : int
Number of iterations.
gamma : positive float
Stepsize in the primal updates.
callback : callable, optional
Function called with the current iterate after each iteration.

Notes
-----
The algorithm was proposed as Algorithm 2.3 in
`[SSC2003]
<http://www.global-sci.org/jcm/readabs.php?vol=21&no=4&page=451&year=2003&ppage=462>`_.
It solves the problem

.. math ::
\min f(x) - g(x)

by using subgradients of :math:`g` and proximal points of :math:`f`.
The iteration is given by

.. math ::
y_n \in \partial g(x_n), \qquad x_{n+1}
= \mathrm{Prox}_{\gamma f}(x_n + \gamma y_n).

In contrast to `dca`, `prox_dca` uses proximal steps with respect to the
convex part ``f``. Both algorithms use subgradients of the concave part
``g``.

References
----------
[SSC2003] Sun, W, Sampaio, R J B, and Candido, M A B. *Proximal point
algorithm for minimization of DC function*. Journal of Computational
Mathematics, 21.4 (2003), pp 451--462.

See also
--------
dca :
Solver with subgradient steps for all the functionals.
doubleprox_dc :
Solver with proximal steps for all the nonsmooth convex functionals
and a gradient step for a smooth functional.
"""
space = f.domain
if g.domain != space:
raise ValueError('`f.domain` and `g.domain` need to be equal, but '
'{} != {}'.format(space, g.domain))
for _ in range(niter):
f.proximal(gamma)(x.lincomb(1, x, gamma, g.gradient(x)), out=x)

if callback is not None:
callback(x)


def doubleprox_dc(x, y, f, phi, g, K, niter, gamma, mu, callback=None):
r"""Double-proxmial gradient d.c. algorithm of Banert and Bot.

This algorithm solves a problem of the form ::

min_x f(x) + phi(x) - g(Kx).

Parameters
----------
x : `LinearSpaceElement`
Initial primal guess, updated in-place.
y : `LinearSpaceElement`
Initial dual guess, updated in-place.
f : `Functional`
Convex functional. Needs to implement ``f.proximal``.
phi : `Functional`
Convex functional. Needs to implement ``phi.gradient``.
Convergence can be guaranteed if the gradient is Lipschitz continuous.
g : `Functional`
Convex functional. Needs to implement ``g.convex_conj.proximal``.
K : `Operator`
Linear operator. Needs to implement ``K.adjoint``.
niter : int
Number of iterations.
gamma : positive float
Stepsize in the primal updates.
mu : positive float
Stepsize in the dual updates.
callback : callable, optional
Function called with the current iterate after each iteration.

Notes
-----
This algorithm is proposed in `[BB2016]
<https://arxiv.org/abs/1610.06538>`_ and solves the d.c. problem

Member: DC


.. math ::
\min_x f(x) + \varphi(x) - g(Kx)

together with its Toland dual

.. math ::
\min_y g^*(y) - (f + \varphi)^*(K^* y).

The iterations are given by

.. math ::
x_{n+1} &= \mathrm{Prox}_{\gamma f} (x_n + \gamma (K^* y_n
- \nabla \varphi(x_n))), \\
y_{n+1} &= \mathrm{Prox}_{\mu g^*} (y_n + \mu K x_{n+1}).

To guarantee convergence, the parameter :math:`\gamma` must satisfy
:math:`0 < \gamma < 2/L` where :math:`L` is the Lipschitz constant of
:math:`\nabla \varphi`.

References
----------
[BB2016] Banert, S, and Bot, R I. *A general double-proximal gradient
algorithm for d.c. programming*. arXiv:1610.06538 [math.OC] (2016).

See also
--------
dca :
Solver with subgradient steps for all the functionals.
prox_dca :
Solver with a proximal step for ``f`` and a subgradient step for ``g``.
"""
primal_space = f.domain
dual_space = g.domain

if phi.domain != primal_space:
raise ValueError('`f.domain` and `phi.domain` need to be equal, but '
'{} != {}'.format(primal_space, phi.domain))
if K.domain != primal_space:
raise ValueError('`f.domain` and `K.domain` need to be equal, but '
'{} != {}'.format(primal_space, K.domain))
if K.range != dual_space:
raise ValueError('`g.domain` and `K.range` need to be equal, but '
'{} != {}'.format(dual_space, K.range))

g_convex_conj = g.convex_conj
for _ in range(niter):
f.proximal(gamma)(x.lincomb(1, x,
gamma, K.adjoint(y) - phi.gradient(x)),
out=x)
g_convex_conj.proximal(mu)(y.lincomb(1, y, mu, K(x)), out=y)

if callback is not None:
callback(x)


def doubleprox_dc_simple(x, y, f, phi, g, K, niter, gamma, mu):
"""Non-optimized version of ``doubleprox_dc``.
This function is intended for debugging. It makes a lot of copies and
performs no error checking.
"""
for _ in range(niter):
f.proximal(gamma)(x + gamma * K.adjoint(y) -
gamma * phi.gradient(x), out=x)
g.convex_conj.proximal(mu)(y + mu * K(x), out=y)
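
For orientation, here is a minimal usage sketch of the three new solvers on the toy problem from the test file below; it assumes this branch is installed so that the solvers are importable from ``odl.solvers``::

    import odl

    # Toy problem from the test: min_x a/2 (x - b)^2 - |x| over rn(1).
    a, b = 0.5, 0.5
    space = odl.rn(1)
    f = a / 2 * odl.solvers.L2NormSquared(space).translated(b)  # convex part
    g = odl.solvers.L1Norm(space)  # convex part that gets subtracted

    # Plain DCA: uses f.convex_conj.gradient and g.gradient.
    x = space.element(-0.5)
    odl.solvers.dca(x, f, g, niter=50)

    # Proximal DCA: uses f.proximal and g.gradient.
    x = space.element(-0.5)
    odl.solvers.prox_dca(x, f, g, niter=50, gamma=1.0)

    # Double-proximal gradient d.c. method with zero smooth part and K = I.
    x, y = space.element(-0.5), space.element(3)
    phi = odl.solvers.ZeroFunctional(space)
    K = odl.IdentityOperator(space)
    odl.solvers.doubleprox_dc(x, y, f, phi, g, K, niter=50, gamma=1.0,
                              mu=1.0)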
93 changes: 93 additions & 0 deletions odl/test/solvers/nonsmooth/difference_convex_test.py
@@ -0,0 +1,93 @@
# coding=utf-8

Member: Do we usually have this in the top of the files? @kohr-h

Contributor Author: No, but without it, the ∂ symbol in the comment would cause a Python 2 error, see #1307 (comment).

Member: In less than two years we can remove it again :-)


# Copyright 2014-2018 The ODL contributors
#
# This file is part of ODL
#
# This Source Code Form is subject to the terms of the Mozilla Public License,
# v. 2.0. If a copy of the MPL was not distributed with this file, You can
# obtain one at https://mozilla.org/MPL/2.0/.

"""Tests for the d.c. solvers."""

from __future__ import division
import odl
from odl.solvers import dca, prox_dca, doubleprox_dc
from odl.solvers.nonsmooth.difference_convex import doubleprox_dc_simple
import numpy as np
import pytest


# Number of decimal places for the accepted error when comparing results
HIGH_ACCURACY = 8
LOW_ACCURACY = 4


def test_dca():
"""Test the dca method for the following simple problem:

Let a > 0, and let b be a real number. Solve

min_x a/2 (x - b)^2 - |x|

To find the possible (local) solutions, we consider several cases:
* x > 0 ==> ∂|.|(x) = 1, i.e., a necessary optimality condition is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we need a unicode marker for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

0 = a (x - b) - 1 ==> x = b + 1/a. This only happens if b > -1/a.
* x < 0 ==> ∂|.|(x) = -1, i.e., a necessary optimality condition is
0 = a (x - b) + 1 ==> x = b - 1/a. This only happens if b < 1/a.
* x = 0 ==> ∂|.|(x) = [-1, 1], i.e., a necessary optimality condition is
0 = a (x - b) + [-1, 1] ==> b - 1/a <= x = 0 <= b + 1/a.

To summarize, we might find the following solution sets:
* {b - 1/a} if b < -1/a,
* {-2/a, 0} if b = -1/a,
* {b - 1/a, 0, b + 1/a} if -1/a < b < 1/a,
* {0, 2/a} if b = 1/a,
* {b + 1/a} if b > 1/a.
"""

# Set the problem parameters

Member: Short comment on that this puts us in the third case?

a = 0.5
b = 0.5
# Since -1/a = -2 < b = 0.5 < 1/a = 2, we are in the third case above.
space = odl.rn(1)
f = a / 2 * odl.solvers.L2NormSquared(space).translated(b)
g = odl.solvers.L1Norm(space)
niter = 50

# Set up some space elements for the solvers to use
x = space.element(-0.5)
x_dca = x.copy()
x_prox_dca = x.copy()
x_doubleprox = x.copy()
x_simpl = x.copy()

# Some additional parameters for some of the solvers
phi = odl.solvers.ZeroFunctional(space)
y = space.element(3)
y_simpl = y.copy()
gamma = 1
mu = 1
K = odl.IdentityOperator(space)

dca(x_dca, f, g, niter)
prox_dca(x_prox_dca, f, g, niter, gamma)
doubleprox_dc(x_doubleprox, y, f, phi, g, K, niter, gamma, mu)
doubleprox_dc_simple(x_simpl, y_simpl, f, phi, g, K, niter, gamma, mu)
expected = np.asarray([b - 1 / a, 0, b + 1 / a])

dist_dca = np.min(np.abs(expected - float(x_dca)))
dist_prox_dca = np.min(np.abs(expected - float(x_prox_dca)))
dist_prox_doubleprox = np.min(np.abs(expected - float(x_doubleprox)))

# Optimized and simplified versions of doubleprox_dc should give
# the same result.
assert float(x_simpl) == pytest.approx(float(x_doubleprox))
assert float(y_simpl) == pytest.approx(float(y))

# All methods should give approximately one solution of the problem.
# For 50 iterations, the methods have been tested to achieve an absolute
# accuracy of at least 1/10^6.
assert float(dist_dca) == pytest.approx(0, abs=1e-6)
assert float(dist_prox_dca) == pytest.approx(0, abs=1e-6)
assert float(dist_prox_doubleprox) == pytest.approx(0, abs=1e-6)
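
As a plain-Python cross-check of the case analysis in the test docstring (a sketch for illustration, not part of the test suite), one can evaluate the objective at the three candidate critical points::

    a, b = 0.5, 0.5

    def objective(x):
        """Evaluate a/2 (x - b)^2 - |x|."""
        return a / 2 * (x - b) ** 2 - abs(x)

    # Candidates from the third case, since -1/a < b < 1/a:
    candidates = [b - 1 / a, 0.0, b + 1 / a]  # [-1.5, 0.0, 2.5]
    print([(x, objective(x)) for x in candidates])
    # Values: (-1.5, -0.5), (0.0, 0.0625), (2.5, -1.5). The point 2.5 gives
    # the smallest objective value, but the solvers may stop at any critical
    # point, which is why the test measures the distance to the whole set.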