I have a df:
import pandas as pd

test2 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9],
})
and I have written a simple function as:
def add(a, b, c):
    return a + b + c
Now I am applying this function to my DataFrame with the pandas df.apply method:
test2.apply(add(test2['A'], test2['B'], test2['C']),axis=1)
It gives me an error saying ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
But when I amend my code as:
test2.apply(lambda df: add(df['A'],df['B'],df['C']),axis=1)
it works perfectly fine, giving me results as I expected as:
0 12
1 15
2 18
My question is: why do I need the lambda expression when I have already defined my function beforehand?
CodePudding user response:
This is because apply expects a function, which it then applies along an axis of a DataFrame (or to the values of a Series). If you call add directly on the three Series objects you are passing to it, you don't need to call apply at all, because addition already works element-wise on Series objects. Not every function can be used this way, however, and that is what apply is for.
The error occurs because add(...) is evaluated before apply runs, so you are passing apply a Series where it expects a function.
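To make the point above concrete, here is a minimal sketch that assumes the same test2 frame and add function from the question. The names direct and via_apply are only illustrative. It shows that calling add(...) yields a finished Series (which is not callable), while the lambda hands apply an actual function to call once per row.

```python
import pandas as pd

test2 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

def add(a, b, c):
    return a + b + c

# add(...) runs immediately, so `direct` is already the result: a Series.
direct = add(test2["A"], test2["B"], test2["C"])
print(type(direct).__name__)  # Series
print(callable(direct))       # False, so apply cannot call it per row

# The lambda is a function; apply calls it once per row when axis=1.
via_apply = test2.apply(lambda row: add(row["A"], row["B"], row["C"]), axis=1)
print(via_apply.tolist())     # [12, 15, 18]
```

In other words, for plain addition the direct call is enough, and apply is only needed when the function cannot work on whole columns at once.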
CodePudding user response:
Change your function's signature so that it takes a whole row:
def add(row):
    return sum(row)

out = test2.apply(add, axis=1)
print(out)
# Output:
0 12
1 15
2 18
dtype: int64
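As a cross-check on this approach, here is a small sketch that assumes the same test2 frame. With axis=1 each row reaches the function as a Series indexed by the column names, so the columns can also be picked out by label. The variable vectorized is only an illustrative name, and DataFrame.sum(axis=1) is shown as a comparison, not as part of the original answer.

```python
import pandas as pd

test2 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

def add(row):
    # With axis=1, `row` is a Series indexed by the column names A, B, C.
    return row["A"] + row["B"] + row["C"]

# Pass the function itself (no parentheses); apply calls it for each row.
out = test2.apply(add, axis=1)

# For a plain row total, the built-in row-wise sum gives the same values.
vectorized = test2.sum(axis=1)

print(out.tolist())
print(vectorized.tolist())
```

Both lists should hold the same row totals as the output shown above.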