I am stuck on an issue that I think should be straightforward. I have a function that I would like to apply to two columns of my dataframe, but I receive an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
To show you what I am trying to do:
import numpy as np

# Calculate the accuracy
def mape(actual, pred):
    if actual == 0:
        if pred == 0:
            return 0
        else:
            return 100
    else:
        return np.mean(np.abs((actual - pred) / actual)) * 100
Then, I try to apply it on two columns (called Actuals_March & Forecast_March).
# This line runs into the ValueError above.
# I removed all NaN values before running this.
df['MAPE_Mar'] = df.apply(lambda x: mape(df.Actuals_March , df.Forecast_March), axis=1)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
This is a snapshot of my data:
df.Actuals_March df.Forecast_March
0.0 0.0
0.0 0.0
0.0 0.0
4.0 0.0
0.0 0.0
5.0 0.0
20.0 0.0
0.0 0.0
2.0 0.0
13.0 0.0
Hope you can help me. Thanks in advance
CodePudding user response:
Replace df with x inside the lambda, so that mape receives the scalar values of each row instead of the whole columns:
df['MAPE_Mar'] = df.apply(lambda x: mape(x.Actuals_March , x.Forecast_March), axis=1)
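To see why this matters: comparing a whole column with 0 gives a boolean Series, and `if` cannot reduce that to a single True or False, hence the ValueError. Here is a minimal sketch that reproduces the error and then runs the row-wise version. The four rows are taken from the snapshot in the question, and mape is slightly condensed.

```python
import numpy as np
import pandas as pd

def mape(actual, pred):
    # Written for scalars: `actual == 0` must evaluate to a single bool.
    if actual == 0:
        return 0 if pred == 0 else 100
    return np.abs((actual - pred) / actual) * 100

# A few rows from the snapshot in the question.
df = pd.DataFrame({
    "Actuals_March": [0.0, 4.0, 5.0, 20.0],
    "Forecast_March": [0.0, 0.0, 0.0, 0.0],
})

# Passing whole columns: `actual == 0` is a boolean Series, so `if` raises.
try:
    mape(df.Actuals_March, df.Forecast_March)
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)

# Passing the row `x`: both arguments are scalars, so the branches work.
row_wise = df.apply(lambda x: mape(x.Actuals_March, x.Forecast_March), axis=1)
print(row_wise.tolist())
```

With these rows the first result is 0 (both values are zero), and the others are 100, because a forecast of 0 against a non-zero actual is a 100% error.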
Vectorized alternative:
m1 = df['Actuals_March'] == 0
m2 = df['Forecast_March'] == 0
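# Take the absolute value of the whole ratio, as mape does
s = np.abs((df['Actuals_March'] - df['Forecast_March']) / df['Actuals_March']) * 100
# Both zero -> 0; actual zero but forecast non-zero -> 100; otherwise the percentage error
df['MAPE_Mar1'] = np.select([m1 & m2, m1 & ~m2], [0, 100], s)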
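If you want to be sure a vectorized version agrees with mape, one option is to run both on a few rows that hit every branch. The sketch below does that. The helper name mape_vectorized and the sample values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

def mape(actual, pred):
    # Scalar reference version, same branches as in the question.
    if actual == 0:
        return 0 if pred == 0 else 100
    return np.abs((actual - pred) / actual) * 100

def mape_vectorized(actual, pred):
    # Hypothetical helper: column-wise equivalent of mape.
    actual_zero = actual == 0
    pred_zero = pred == 0
    # Replace zero actuals with NaN so the division never sees a zero.
    safe_actual = actual.where(~actual_zero)
    pct = ((actual - pred) / safe_actual).abs() * 100
    values = np.select(
        [actual_zero & pred_zero, actual_zero & ~pred_zero],
        [0.0, 100.0],
        default=pct,
    )
    return pd.Series(values, index=actual.index)

# Made-up rows covering each branch, including a negative actual.
check = pd.DataFrame({
    "Actuals_March": [0.0, 0.0, 4.0, 10.0, -5.0],
    "Forecast_March": [0.0, 3.0, 0.0, 8.0, -4.0],
})

expected = check.apply(lambda x: mape(x.Actuals_March, x.Forecast_March), axis=1)
result = mape_vectorized(check.Actuals_March, check.Forecast_March)
print(np.allclose(result, expected))
```

The second row (actual 0, forecast 3) is the one worth watching: without a mask for that case, the division would give inf instead of 100.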