SciPy constrained optimization bug
Constrained optimization tasks are fundamental and occur commonly in
real-world scenarios. A common example is a physical mixture, whose components
must sum to 100% and be non-negative; in SciPy these conditions can be written
directly as a linear constraint plus bounds, as in the short sketch below.
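As a minimal sketch, assuming a made-up three-component objective (the quadratic
mix_obj below is purely hypothetical), the mixture conditions translate into a
sum-to-one LinearConstraint and non-negativity Bounds:
import numpy as np
import scipy.optimize as opt
# Hypothetical objective over three mixture fractions (placeholder for a real model)
mix_obj = lambda x: (x[0] - 0.5)**2 + (x[1] - 0.3)**2 + (x[2] - 0.2)**2
# The fractions must sum to 100% (i.e., 1.0)...
sum_to_one = opt.LinearConstraint(np.ones((1, 3)), 1, 1)
# ...and each fraction must lie between 0 and 1
frac_bnds = opt.Bounds(lb = np.zeros(3), ub = np.ones(3))
res = opt.minimize(fun = mix_obj, x0 = np.full(3, 1/3), method = 'trust-constr',
                   constraints = [sum_to_one], bounds = frac_bnds)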
Constraints can also involve more complex, nonlinear relationships between
variables. Surprisingly, there can still be some difficulty finding good
software for the task. Historically,
a lot of the best software was commercial. This is still the case for certain
classes of optimization problems, such as mixed integer nonlinear programming
problems. In recent memory, Matlab’s fmincon function was the routine most
often recommended for purely continuous inputs, as it supports arbitrary
constraints. SciPy, which I have used for many years, has an analogous routine,
minimize(method='trust-constr'). Early in the year I was running a simulation
that made thousands of calls to minimize(method='trust-constr') with various
constraints. I was surprised to
find that the simulation errored out relatively often, saying that the bound
constraints were violated. I reported the issue
here. I implemented a quick fix
to suit my needs here, and
eventually implemented a more straightforward solution
here. It seems like this fix will
be merged soon. Without going into details (you can read the comments on the
PRs and look at the diffs, if you like), there are workarounds that can be used
if you experience this issue and don’t want to modify the SciPy source code.
This issue only occurs when using the built-in numerical methods to calculate
the derivative functions (Jacobian and Hessian) of your optimization function
and constraint functions. Analytical derivative functions are always preferred
when possible, since estimating the derivatives numerically requires many
additional function evaluations, but analytical forms aren’t always practical
or possible. If
you look at the PRs, you will see the issue is in the classes in
_differentiable_functions.py in the SciPy optimization module, since
the methods used for calculating gradients, etc., will not always keep the
solutions in the feasible (bound-constrained) domain. The simplest
workaround, short of implementing one of the fixes into your local SciPy, is to
supply your own general functions for computing the Jacobian and Hessian. In
spite of the apparent complexity of the classes in _differentiable_functions.py,
this can be done very simply. For instance, look at the example case I gave
showing the bug:
import numpy as np
import scipy.optimize as opt
of = lambda x: np.exp(x[0])*(4*x[0]**2 + 2*x[1]**2 + 4*x[0]*x[1] + 2*x[1] + 1)
nce = lambda x: x[0]**2 + x[1]
nci = lambda x: x[0]*x[1]
x0 = np.array((0.99, -0.99))
# This causes the bug - using a better starting position, [-1, 1], does not
nlcs = [opt.NonlinearConstraint(nci, -10, np.inf),
        opt.NonlinearConstraint(nce, 1, 1)]
bnds = opt.Bounds(lb = np.array((-1, -1)), ub = np.array((1, 1)),
                  keep_feasible=True)
tst = opt.minimize(fun = of, x0 = x0, method = 'trust-constr', bounds = bnds,
                   constraints = nlcs)
This will error out. However, we can define some general functions for numerically computing Jacobians and Hessians, and specify these functions as the derivative functions to be used. In this case I will use complex-step differentiation, based on the implementations by Yi Cao here and here. The complex-step approach perturbs the input along the imaginary axis and takes the imaginary part of the result, so that f'(x) ≈ Im(f(x + ih))/h, avoiding the subtractive cancellation that limits ordinary finite differences.
def hesscs(x, fun):
    # Hessian of a scalar function via a complex-step / central-difference scheme
    n = x.flatten().shape[0]
    H = np.zeros((n, n))
    h = np.sqrt(np.finfo(float).eps)
    h2 = h*h
    for k in range(n):
        # perturb the k-th component along the imaginary axis
        x1 = np.array(x, dtype=complex)
        x1[k] = x1[k] + h*1j
        for l in range(k, n):
            # central difference in the real direction for the l-th component
            x2 = x1.copy()
            x2[l] = x2[l] + h
            u1 = fun(x2)
            x2[l] = x1[l] - h
            u2 = fun(x2)
            H[k, l] = np.imag(u1 - u2)/h2/2
            H[l, k] = H[k, l]
    return H
def gradcs(x, fun):
    # gradient of a scalar function via complex-step differentiation
    n = x.flatten().shape[0]
    g = np.zeros(n)
    h = np.sqrt(np.finfo(float).eps)
    for k in range(n):
        # perturb only the k-th component along the imaginary axis
        x1 = np.array(x, dtype=complex)
        x1[k] = x1[k] + h*1j
        v = fun(x1)
        g[k] = np.imag(v)/h
    return g
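As a quick sanity check, these helpers can be compared against a small test function whose gradient and Hessian are known analytically; the cubic below is just a hypothetical example, not part of the original bug report.
test_fun = lambda x: x[0]**2*x[1] + x[1]**3
xt = np.array((1.5, -0.5))
# analytical gradient and Hessian of test_fun at xt
g_true = np.array((2*xt[0]*xt[1], xt[0]**2 + 3*xt[1]**2))
H_true = np.array([[2*xt[1], 2*xt[0]],
                   [2*xt[0], 6*xt[1]]])
print(np.allclose(gradcs(xt, test_fun), g_true),
      np.allclose(hesscs(xt, test_fun), H_true))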
With these two functions, we can specify that these methods should be used to compute the Jacobian and Hessian of our optimization function and the Jacobians of our constraints.
jac_fun = lambda x: gradcs(x, of)
hess_fun = lambda x: hesscs(x, of)
nlcs_wj = [opt.NonlinearConstraint(nci, -10, np.inf,
                                   jac = lambda x: gradcs(x, nci)),
           opt.NonlinearConstraint(nce, 1, 1, jac = lambda x: gradcs(x, nce))]
tst = opt.minimize(fun = of, x0 = x0, jac = jac_fun, hess = hess_fun,
                   method = 'trust-constr', bounds = bnds,
                   constraints = nlcs_wj)
This will run successfully, in the sense that it will not throw an error. It will not find the “correct” answer, since the supplied starting point is quite poor, but even so, throwing an error should not be necessary in such a case.
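As an aside, the same complex-step helpers can also be used for the constraint Hessians if you want to avoid SciPy’s default quasi-Newton approximation for them. This is only a sketch of that option, not part of the original fix; note that NonlinearConstraint expects a Hessian callable with signature hess(x, v), returning the Hessian of dot(fun, v), so the scalar constraints here are scaled by their single multiplier v[0].
nlcs_wjh = [opt.NonlinearConstraint(nci, -10, np.inf,
                                    jac = lambda x: gradcs(x, nci),
                                    hess = lambda x, v: v[0]*hesscs(x, nci)),
            opt.NonlinearConstraint(nce, 1, 1,
                                    jac = lambda x: gradcs(x, nce),
                                    hess = lambda x, v: v[0]*hesscs(x, nce))]
tst2 = opt.minimize(fun = of, x0 = x0, jac = jac_fun, hess = hess_fun,
                    method = 'trust-constr', bounds = bnds,
                    constraints = nlcs_wjh)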