I am trying to implement logistic regression with regularization in Python using optimize.minimize
from the SciPy library. Here is my code:
import pandas as pd
import numpy as np
from scipy import optimize
l = 0.1 # lambda
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function_logit(theta, X, y, l):
    h = sigmoid(X @ theta)
    # cost
    J = -1 / m * (y.T @ np.log(h)
                  + (1 - y).T @ np.log(1 - h)) \
        + l / (2 * m) * sum(theta[1:] ** 2)
    # gradient
    a = 1 / m * X.T @ (h - y)
    b = l / m * theta
    grad = a + b
    grad[0] = 1 / m * sum(h - y)
    return J, grad
data = pd.read_excel('Data.xlsx')
X = data.drop(columns = ['healthy'])
m, n = X.shape
X = X.to_numpy()
X = np.hstack([np.ones([m, 1]), X])
y = pd.DataFrame(data, columns = ['healthy'])
y = y.to_numpy()
initial_theta = np.zeros([n + 1, 1])
options = {'maxiter': 400}
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        (X, y, l),
                        jac = True,
                        method = 'TNC',
                        options = options)
An error occurs on the line where I use optimize.minimize. The last two lines of the error are as follows:
grad = a + b
ValueError: operands could not be broadcast together with shapes (17,90) (17,)
I have checked the type and dimensions of X, y and theta, and they seem correct to me.
>>> type(X)
<class 'numpy.ndarray'>
>>> type(y)
<class 'numpy.ndarray'>
>>> type(theta)
<class 'numpy.ndarray'>
>>> X.shape
(90, 17)
>>> y.shape
(90, 1)
>>> theta.shape
(17, 1)
The error says a is a (17,90) matrix but based on my calculations it should be a (17,1) vector. Does anyone know where I went wrong?
CodePudding user response:
I found a solution. Apparently, optimize.minimize didn't like that y and theta had shapes (90,1) and (17,1), respectively. I converted their shapes to (90,) and (17,) and the error message went away.
In terms of code, I changed
initial_theta = np.zeros([n + 1, 1])
to just this:
initial_theta = np.zeros([n + 1])
and I added the following line:
y = np.reshape(y, [m])
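Putting the two changes together, here is a sketch of the relevant part of the script after the fix (assuming everything else above stays the same):

y = np.reshape(y, [m])               # (90, 1) -> (90,)
initial_theta = np.zeros([n + 1])    # (17,) instead of (17, 1)
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        (X, y, l),
                        jac = True,
                        method = 'TNC',
                        options = options)

With both arrays 1-D, h - y inside the cost function has shape (90,), so a and b both come out as (17,) and grad = a + b works.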
Thanks to those who tried to help me.
CodePudding user response:
The elements of a are 90-dimensional vectors, whereas the elements of b are numbers. I'm not totally sure what you're trying to do, but if you want to add vectors, they need to have the same shape. If you want to add the thing in b to each element of a row-wise, you can do

grad = a + np.stack((b,) * a.shape[1], axis=-1)

but I'm assuming you're just messing up the construction of a.
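For illustration, here is a tiny standalone sketch of that shape mismatch and the stack workaround (the arrays are made-up stand-ins for your a and b):

import numpy as np

a = np.ones((3, 4))    # stand-in for the (17, 90) array
b = np.arange(3.0)     # stand-in for the (17,) vector
# a + b raises "operands could not be broadcast together with shapes (3,4) (3,)"
grad = a + np.stack((b,) * a.shape[1], axis=-1)   # both operands are now (3, 4)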