I'm trying to use the optimization module in SciPy to solve a constrained optimization problem, and I need to implement the 'hess' argument. In SciPy's documentation and tutorial, the Hessians are simply [[2, 0], [0, 0]] and [[2, 0], [0, 0]]. However, my Hessians are something like [[(-24)*x[0]**2 + 48*x[0] - 16, 0], [0, 0]] and [[(-48)*x[0]**2 + 192*x[0] - 176, 0], [0, 0]], so I cannot simply use numpy.array to do the multiplication. It seems that I should pass a LinearOperator object to the 'hess' argument, but examples of using LinearOperator are unclear in both the documentation and the tutorial.
My code is:
import numpy as np
from scipy.optimize import Bounds
from scipy.optimize import NonlinearConstraint
from scipy.optimize import minimize
def f(x):
    return -x[0] - x[1]

def grad(x):
    return np.array([-1, -1])

def hess(x):
    return np.array([[0, 0], [0, 0]])

def cons_f(x):
    return [(-2)*x[0]**4 + 8*x[0]**3 + (-8)*x[0]**2 + x[1] - 2,
            (-4)*x[0]**4 + 32*x[0]**3 + (-88)*x[0]**2 + 96*x[0] + x[1] - 36]

def cons_Jacobian(x):
    return [[(-8)*x[0]**3 + 24*x[0]**2 - 16*x[0], 1],
            [(-16)*x[0]**3 + 96*x[0]**2 - 176*x[0] + 96, 1]]

def cons_Hessian(x, v):
    # TODO
    return v[0]*[[(-24)*x[0]**2 + 48*x[0] - 16, 0], [0, 0]] + v[1]*[[(-48)*x[0]**2 + 192*x[0] - 176, 0], [0, 0]]
nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 0, jac=cons_Jacobian, hess=cons_Hessian)
bounds = Bounds([0, 0], [3.0, 4.0])
x0 = np.array([0.5, 1])
res = minimize(f, x0, method='trust-constr', jac=grad, hess=hess,
               constraints=[nonlinear_constraint], bounds=bounds)
The cons_Hessian(x, v) is absolutely wrong in my code. In their example, although the Hessians are simply [[2, 0], [0, 0]] and [[2, 0], [0, 0]], the usage is confusing, and I don't understand where p comes in:
from scipy.sparse.linalg import LinearOperator

def cons_H_linear_operator(x, v):
    def matvec(p):
        return np.array([p[0]*2*(v[0] + v[1]), 0])
    return LinearOperator((2, 2), matvec=matvec)

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1,
                                           jac=cons_J, hess=cons_H_linear_operator)
Answer:
There's no need to use a LinearOperator. You only need to ensure that cons_f, cons_Jacobian and cons_Hessian return np.ndarrays. That's also why your cons_Hessian can't be evaluated: it returns plain Python lists, and multiplying a list by a float like v[0] is not defined. Additionally, it's highly recommended to use float literals instead of integer ones, i.e. -2.0 instead of -2, to prevent the functions from returning np.ndarrays with an integer dtype.
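To see why the dtype matters, here's a minimal illustration (independent of SciPy): NumPy refuses to cast float results into an integer array in place, so routines that update such arrays can fail:

a = np.array([1, 2])   # integer dtype is inferred from the integer literals
a += 0.5               # raises UFuncTypeError: cannot cast the float64 result to the integer array

Starting from float literals makes NumPy infer a float dtype, which avoids this class of error.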
Your example works for me after rewriting these functions as follows:
def cons_f(x):
    con1 = (-2.0)*x[0]**4 + 8*x[0]**3 + (-8)*x[0]**2 + x[1] - 2
    con2 = (-4)*x[0]**4 + 32*x[0]**3 + (-88)*x[0]**2 + 96*x[0] + x[1] - 36
    return np.array([con1, con2])

def cons_Jacobian(x):
    con1_grad = [(-8.0)*x[0]**3 + 24*x[0]**2 - 16*x[0], 1]
    con2_grad = [(-16)*x[0]**3 + 96*x[0]**2 - 176*x[0] + 96, 1]
    return np.array([con1_grad, con2_grad])

def cons_Hessian(x, v):
    con1_hess = np.array([[(-24.0)*x[0]**2 + 48*x[0] - 16, 0], [0, 0]])
    con2_hess = np.array([[(-48)*x[0]**2 + 192*x[0] - 176, 0], [0, 0]])
    return v[0]*con1_hess + v[1]*con2_hess
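As for where p comes in: when hess returns a LinearOperator, the solver never materializes the Hessian matrix; per the SciPy tutorial it only requests Hessian-vector products, calling matvec(p) with vectors p of its own choosing. Purely for illustration (it isn't needed here), a sketch of a LinearOperator version of your cons_Hessian might look like this; the name cons_H_linop is mine:

import numpy as np
from scipy.sparse.linalg import LinearOperator

def cons_H_linop(x, v):
    # (0, 0) entries of the two constraint Hessians; every other entry is zero
    h1 = -24.0*x[0]**2 + 48.0*x[0] - 16.0
    h2 = -48.0*x[0]**2 + 192.0*x[0] - 176.0
    def matvec(p):
        # computes (v[0]*H1 + v[1]*H2) @ p without forming the matrices
        return np.array([(v[0]*h1 + v[1]*h2) * p[0], 0.0])
    return LinearOperator((2, 2), matvec=matvec)

Passing hess=cons_H_linop to NonlinearConstraint should behave the same as the np.ndarray version above.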