I'm trying to compute a new column in a pandas DataFrame from other columns, using a function I wrote. Instead of using a for loop, I would prefer to apply the function to entire DataFrame columns.
My code looks like this:
df['po'] = vect.func1(df['gra'],
                      Se,
                      df['p_a'],
                      df['t'],
                      Tc)
where df['gra'], df['p_a'], and df['t'] are my DataFrame columns (parameters), and Se and Tc are other (real-valued) parameters. df['po'] is my new column.
func1 is a function defined in my vect package:
def func1(g, surf_e, Pa, t, Tco):
    if (t <= Tco):
        pos = (g-(Pa*surf_e*g))
    else:
        pos = 0.0
    return(pos)
When I run it this way, I get an error message pointing at the line: if (t <= Tco):
The error is: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I read the pandas documentation but couldn't find a solution. Can anybody explain what the problem is?
I also tried to use apply, for example:
df['po'] = df['gra'].apply(vect.func1)
but I don't know how to use apply with multiple columns as parameters.
Thank you in advance.
CodePudding user response:
Use np.where with the required condition, the value to use where the condition is True, and the default value to use elsewhere:
df['po'] = np.where(
    df['t'] <= Tc,                             # Condition
    df['gra'] - (df['p_a'] * Se * df['gra']),  # Value if True
    0.0                                        # Value if False
)
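As a quick check, here is a minimal self-contained sketch of the same np.where call. The column names follow the question, but the sample values for the columns, Se and Tc are invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample data; only the column names come from the question.
df = pd.DataFrame({
    "gra": [10.0, 20.0, 30.0],
    "p_a": [0.5, 0.25, 0.125],
    "t":   [1.0, 5.0, 9.0],
})
Se = 0.5  # assumed value, for illustration only
Tc = 5.0  # assumed value, for illustration only

# Evaluated element-wise: one result per row, no Python-level loop.
df["po"] = np.where(
    df["t"] <= Tc,                             # boolean Series
    df["gra"] - (df["p_a"] * Se * df["gra"]),  # used where the condition is True
    0.0,                                       # used where it is False
)
print(df)
```

Note that both value expressions are computed for every row; np.where only selects between them afterwards.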
EDIT:
Don't forget to import numpy as np
Also, you get the error because t <= Tco compares a whole Series with a value, which produces a Series of booleans (one per row) rather than a single boolean, and an if statement needs a single True or False.
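To illustrate both points, the sketch below first shows that the comparison yields a boolean Series whose truth value cannot be taken, and then shows one way to keep the original scalar function by calling it row by row with DataFrame.apply and axis=1. The sample data and the values of Se and Tc are assumptions made for the example, and func1 is copied from the question rather than imported from the vect package. Row-wise apply is usually slower than np.where on large frames, so treat it as an alternative for when the function cannot easily be vectorized.

```python
import pandas as pd

def func1(g, surf_e, Pa, t, Tco):
    # Scalar function from the question: expects single values, not Series.
    if t <= Tco:
        return g - (Pa * surf_e * g)
    return 0.0

# Made-up sample data; only the column names come from the question.
df = pd.DataFrame({
    "gra": [10.0, 20.0, 30.0],
    "p_a": [0.5, 0.25, 0.125],
    "t":   [1.0, 5.0, 9.0],
})
Se, Tc = 0.5, 5.0  # assumed values, for illustration only

# The comparison is element-wise and returns one boolean per row.
mask = df["t"] <= Tc

# Using that Series where a single boolean is required raises ValueError,
# which is what happens inside `if (t <= Tco):`.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# Row-wise alternative: with axis=1 each row is passed to the lambda,
# so func1 only ever receives scalar values.
df["po"] = df.apply(
    lambda row: func1(row["gra"], Se, row["p_a"], row["t"], Tc),
    axis=1,
)
print(df)
```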