Pandas df.apply()


I have a df:

import pandas as pd

test2 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]})

and I have written a simple function as:

def add(a, b, c):
    return a + b + c

Now I am using this function on my df with the pandas df.apply method:

test2.apply(add(test2['A'], test2['B'], test2['C']),axis=1)

It gives me an error saying ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

But when I amend my code as:

test2.apply(lambda df: add(df['A'],df['B'],df['C']),axis=1)

it works perfectly fine, giving me the results I expected:

0    12
1    15
2    18

My question is: why do I need the lambda expression when I have already defined my function beforehand?

CodePudding user response:

It is because apply expects a function that it can call on each row or column. If you call add directly on the three Series objects you are passing to it, you don't need apply at all, because addition already works element-wise on Series. However, not every function is vectorized this way, and that is what apply is for.
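For example (a minimal sketch reusing the test2 frame and the corrected add from the question), calling the function directly on the columns already produces the summed Series:

result = add(test2['A'], test2['B'], test2['C'])
print(result)
# 0    12
# 1    15
# 2    18
# dtype: int64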

The error you are getting arises because add(test2['A'], test2['B'], test2['C']) is evaluated immediately, so apply receives the resulting Series where it expects a function.
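If you want to keep a named function and still use apply, one option (just a sketch; the wrapper name add_row is illustrative, and it does exactly what the lambda in the question does) is a function that accepts a single row:

def add_row(row):
    # apply with axis=1 passes each row in as one Series,
    # so unpack the columns before delegating to add
    return add(row['A'], row['B'], row['C'])

test2.apply(add_row, axis=1)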

CodePudding user response:

Change your function's signature to take a single row:

def add(row):
    # row is one row of the frame, passed in as a Series;
    # sum() adds up all of its values
    return sum(row)

out = test2.apply(add, axis=1)
print(out)

# Output:
0    12
1    15
2    18
dtype: int64
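
Note that sum(row) adds every value in the row, not just columns A, B and C, so this matches the expected output only because the frame has no other columns. For a plain row total, pandas' built-in vectorized sum is simpler and faster:

# vectorized equivalent; avoids calling a Python function per row
out = test2.sum(axis=1)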