Python - ValueError: operands could not be broadcast together with shapes (17,90) (17,)


I am trying to implement logistic regression with regularization in Python using optimize.minimize from the SciPy library. Here is my code:

import pandas as pd
import numpy as np
from scipy import optimize

l = 0.1 # lambda

def sigmoid(z):

    return 1 / (1 + np.exp(-z))

def cost_function_logit(theta, X, y, l):

    h = sigmoid(X @ theta)

    # cost

    J = -1 / m * (y.T @ np.log(h)
                  + (1 - y).T @ np.log(1 - h)) \
        + l / (2 * m) * sum(theta[1:] ** 2)

    # gradient

    a = 1 / m * X.T @ (h - y)
    b = l / m * theta
    grad = a + b
    grad[0] = 1 / m * sum(h - y)

    return J, grad

data = pd.read_excel('Data.xlsx')

X = data.drop(columns = ['healthy'])
m, n = X.shape
X = X.to_numpy()
X = np.hstack([np.ones([m, 1]), X])

y = pd.DataFrame(data, columns = ['healthy'])
y = y.to_numpy()

initial_theta = np.zeros([n + 1, 1])

options = {'maxiter': 400}
res = optimize.minimize(cost_function_logit,
                        initial_theta,
                        (X, y, l),
                        jac = True,
                        method = 'TNC',
                        options = options)

An error occurs on the line where I use optimize.minimize. The last two lines of the error are as follows:

grad = a + b

ValueError: operands could not be broadcast together with shapes (17,90) (17,)

I have checked the type and dimensions of X, y and theta, and they seem correct to me.

>>> type(X)
<class 'numpy.ndarray'>
>>> type(y)
<class 'numpy.ndarray'>
>>> type(theta)
<class 'numpy.ndarray'>
>>> X.shape
(90, 17)
>>> y.shape
(90, 1)
>>> theta.shape
(17, 1)

The error says a is a (17,90) matrix but based on my calculations it should be a (17,1) vector. Does anyone know where I went wrong?

I found a solution. The problem was that y and theta were 2-D arrays with shapes (90,1) and (17,1), respectively, which optimize.minimize did not handle the way I expected. After I reshaped them to 1-D arrays with shapes (90,) and (17,), the error went away.

In terms of code, I changed

initial_theta = np.zeros([n + 1, 1])

to just this:

initial_theta = np.zeros([n + 1])

and I added the following line:

y = np.reshape(y, [m])
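
The shape mismatch can be reproduced with random stand-in data (not the spreadsheet from the question). This is only a sketch, and it assumes that theta reaches the cost function as a 1-D array of shape (17,), which is what the (17,) in the error message suggests. With a 1-D h and a (90,1) y, the subtraction h - y broadcasts to (90,90), and that is where the (17,90) comes from:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 90, 16  # made-up sizes chosen to match the shapes in the question

# Stand-in design matrix with an intercept column: shape (90, 17)
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n))])
# Stand-in labels as a column vector, as in the question: shape (90, 1)
y_col = rng.integers(0, 2, size=(m, 1)).astype(float)
# Assumption: theta is 1-D inside the cost function: shape (17,)
theta = np.zeros(n + 1)

h = 1 / (1 + np.exp(-(X @ theta)))  # shape (90,)

# (90,) minus (90, 1) broadcasts to (90, 90) rather than subtracting element-wise
bad_residual = h - y_col
a_bad = 1 / m * X.T @ bad_residual  # shape (17, 90)

# With y flattened to (90,), the residual and the gradient term are 1-D
y_flat = y_col.reshape(m)
a_good = 1 / m * X.T @ (h - y_flat)  # shape (17,)

print(bad_residual.shape, a_bad.shape, a_good.shape)
```

Once y is 1-D, a and b both have shape (17,), so a + b no longer fails.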

Thanks to those who tried to help me.

The elements of a are 90-dimensional vectors, whereas the elements of b are scalars. I'm not totally sure what you're trying to do, but arrays need compatible shapes to be added. If you want to add each element of b to every entry in the corresponding row of a, you can do

grad = a + np.stack((b,) * a.shape[1], axis=-1)

but I'm assuming the real problem is in how a is constructed.
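
For reference, the stacking line is equivalent to adding b as a column through broadcasting. A small sketch with made-up arrays (the values and sizes here are placeholders, not the data from the question):

```python
import numpy as np

a = np.arange(6.0).reshape(3, 2)   # stand-in for a, shape (3, 2)
b = np.array([10.0, 20.0, 30.0])   # stand-in for b, shape (3,)

# Repeat b once per column of a, then add: b[i] is added to every entry of row i
stacked = a + np.stack((b,) * a.shape[1], axis=-1)

# Same result by turning b into a (3, 1) column and letting NumPy broadcast
broadcast = a + b[:, np.newaxis]

print(np.array_equal(stacked, broadcast))
```

Either form only makes the shapes line up. It does not make the result a valid gradient if a itself has the wrong shape.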

