Why am I getting a singular-matrix error when I run this as a function, when the same matrices worked correctly run step by step before? I'm following along with a video and can't see what I did differently when looking through what he did.
import numpy as np

# Artificial data for learning purposes, provided by Alex
X = [
    [148, 24, 1385],
    [132, 25, 2031],
    [453, 11, 86],
    [158, 24, 185],
    [172, 25, 201],
    [413, 11, 86],
    [38, 54, 185],
    [142, 25, 431],
    [453, 31, 86],
]
X = np.array(X)
# Add the bias (intercept) column before fitting.
ones = np.ones(X.shape[0])
X = np.column_stack([ones, X])
# y values provided by Alex
y = [10000,20000,15000,20050,10000,20000,15000,25000,12000]
# Normal equation: w = (X^T X)^-1 X^T y
XTX = X.T.dot(X)
XTX_inv = np.linalg.inv(XTX)
w_full = XTX_inv.dot(X.T).dot(y)
# bias value
w0 = w_full[0]
# features
w = w_full[1:]
print(w0, w)
#Output: 25844.754055766753, array([ -16.08906468, -199.47254894, -1.22802883])
At this point the code runs as expected. However, wrapping the same steps in a function raises an error:
# Error in function
def train_linear_regression(X, y):
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])
    XTX = X.T.dot(X)
    XTX_inv = np.linalg.inv(XTX)
    w = XTX_inv.dot(X.T).dot(y)
    return w[0], w[1:]
w0, w = train_linear_regression(X, y)
print(w0, w)
---------------------------------------------------------------------------
LinAlgError Traceback (most recent call last)
<ipython-input-48-ed5d7ddc40b1> in <module>
----> 1 train_linear_regression(X,y)
<__array_function__ internals> in inv(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
86
87 def _raise_linalgerror_singular(err, flag):
---> 88 raise LinAlgError("Singular matrix")
89
90 def _raise_linalgerror_nonposdef(err, flag):
LinAlgError: Singular matrix
CodePudding user response:
It looks like you are adding/stacking the bias column (ones) to X twice: first in the normal flow, and then a second time inside the function. The two identical columns of ones make the columns of X linearly dependent, which leads to a determinant of 0 for the matrix XTX inside the function.
So you need to remove the addition of ones from one of the two places.
def train_linear_regression(X, y):
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])  # stacking ones here again - REMOVE
    ....
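A quick way to see why this happens, sketched with the post's data (the names X_once and X_twice are mine):

```python
import numpy as np

X = np.array([[148, 24, 1385], [132, 25, 2031], [453, 11, 86],
              [158, 24, 185], [172, 25, 201], [413, 11, 86],
              [38, 54, 185], [142, 25, 431], [453, 31, 86]])

ones = np.ones(X.shape[0])
X_once = np.column_stack([ones, X])        # bias added once: shape (9, 4)
X_twice = np.column_stack([ones, X_once])  # bias added twice: shape (9, 5)

# Two identical columns of ones make the columns linearly dependent,
# so X_twice.T @ X_twice is a 5x5 matrix of rank 4 -> singular.
print(np.linalg.matrix_rank(X_once.T @ X_once))    # 4: full rank, invertible
print(np.linalg.matrix_rank(X_twice.T @ X_twice))  # 4: rank-deficient
```

Calling np.linalg.inv on the rank-deficient 5x5 matrix raises exactly the LinAlgError shown in the question.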
CodePudding user response:
Try the code below; there is no error:
X = [
    [148, 24, 1385],
    [132, 25, 2031],
    [453, 11, 86],
    [158, 24, 185],
    [172, 25, 201],
    [413, 11, 86],
    [38, 54, 185],
    [142, 25, 431],
    [453, 31, 86],
]
X = np.array(X) #maintain this as input
def train_linear_regression(X, y):  # this is your code, no change
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])
    XTX = X.T.dot(X)
    XTX_inv = np.linalg.inv(XTX)
    w = XTX_inv.dot(X.T).dot(y)
    return w[0], w[1:]
w0, w = train_linear_regression(X, y)
print(w0, w)
Output:
25844.754055766753 [ -16.08906468 -199.47254894 -1.22802883]
The reason is that you had already converted X before calling the function, with this code:
ones = np.ones(X.shape[0])
X = np.column_stack([ones, X])
So the original X with shape (9, 3):
array([[ 148,   24, 1385],
       [ 132,   25, 2031],
       [ 453,   11,   86],
       [ 158,   24,  185],
       [ 172,   25,  201],
       [ 413,   11,   86],
       [  38,   54,  185],
       [ 142,   25,  431],
       [ 453,   31,   86]])
has become this with shape (9, 4):
array([[1.000e+00, 1.480e+02, 2.400e+01, 1.385e+03],
       [1.000e+00, 1.320e+02, 2.500e+01, 2.031e+03],
       [1.000e+00, 4.530e+02, 1.100e+01, 8.600e+01],
       [1.000e+00, 1.580e+02, 2.400e+01, 1.850e+02],
       [1.000e+00, 1.720e+02, 2.500e+01, 2.010e+02],
       [1.000e+00, 4.130e+02, 1.100e+01, 8.600e+01],
       [1.000e+00, 3.800e+01, 5.400e+01, 1.850e+02],
       [1.000e+00, 1.420e+02, 2.500e+01, 4.310e+02],
       [1.000e+00, 4.530e+02, 3.100e+01, 8.600e+01]])
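As a side note, the same fit can be written without forming an explicit inverse: np.linalg.solve on the normal equations is generally the numerically safer choice. A sketch with the post's data:

```python
import numpy as np

def train_linear_regression(X, y):
    # Same normal-equation fit, but solve (X^T X) w = X^T y directly
    # instead of computing an explicit inverse.
    ones = np.ones(X.shape[0])
    X = np.column_stack([ones, X])
    w = np.linalg.solve(X.T.dot(X), X.T.dot(y))
    return w[0], w[1:]

X = np.array([[148, 24, 1385], [132, 25, 2031], [453, 11, 86],
              [158, 24, 185], [172, 25, 201], [413, 11, 86],
              [38, 54, 185], [142, 25, 431], [453, 31, 86]])
y = np.array([10000, 20000, 15000, 20050, 10000, 20000, 15000, 25000, 12000])

w0, w = train_linear_regression(X, y)
print(w0, w)  # approximately 25844.754, [-16.089, -199.473, -1.228]
```

This produces the same weights as the inv-based version (to floating-point precision), and np.linalg.solve raises the same LinAlgError on a singular system, so the double-stacking bug would still be caught.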