How to apply a function with several dataframe columns as arguments?

Time:01-02

I'm trying to compute a new column in a pandas dataframe from other columns, using a function I wrote. Instead of using a for loop, I would prefer to apply the function to entire dataframe columns.

My code looks like this:

    df['po'] = vect.func1(df['gra'],
                           Se, 
                           df['p_a'], 
                           df['t'], 
                           Tc)

where df['gra'], df['p_a'], and df['t'] are my dataframe columns (parameters), and Se and Tc are other (real-valued) parameters. df['po'] is my new column.

func1 is a function defined in my vect package. The function is:

def func1(g, surf_e, Pa, t, Tco):

    if (t <= Tco):
        pos = (g-(Pa*surf_e*g))
    else: 
        pos = 0.0
    return(pos)

When implemented this way, I get an error message concerning the line: if (t <= Tco):

The error is : ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I read the pandas documentation but didn't find a solution. Can anybody explain what the problem is?

I also tried to use apply, for example:

df['po'] = df['gra'].apply(vect.func1)

but I don't know how to use apply with multiple columns as parameters.

Thank you in advance.

CodePudding user response:

Use np.where, passing it the condition, the value to use where the condition is True, and the default value to use elsewhere.

df['po'] = np.where(
    df['t'] <= Tc,                               # Condition
    df['gra'] - (df['p_a'] * Se * df['gra']),    # Value if True
    0.0                                          # Value if False
)
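As a self-contained illustration, here is a minimal sketch of the same np.where call on a small made-up dataframe. The sample values, and the choices Se = 2.0 and Tc = 3.0, are assumptions for demonstration only; just the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "gra": [10.0, 20.0, 30.0],
    "p_a": [0.25, 0.5, 0.125],
    "t":   [1.0, 5.0, 3.0],
})
Se = 2.0  # assumed scalar parameter
Tc = 3.0  # assumed scalar threshold

# Evaluate the condition for every row at once and pick a value per row.
df["po"] = np.where(
    df["t"] <= Tc,                             # element-wise condition
    df["gra"] - (df["p_a"] * Se * df["gra"]),  # used where the condition is True
    0.0,                                       # used where it is False
)
print(df)
```

Rows where t is at most Tc get the computed value, and the remaining rows get 0.0.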

EDIT:

Don't forget to import numpy as np

Also, you get the error because inside func1, t is a whole Series, so t <= Tco produces a Series of boolean values (one per row) rather than the single boolean value that an if statement needs.
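To make this concrete, the sketch below shows that the comparison yields a boolean Series, and that the scalar func1 from the question can still be used row by row through DataFrame.apply with axis=1. The sample data and the values of Se and Tc are made up for illustration. A row-wise apply calls the Python function once per row, so it is usually much slower than a vectorized expression on large dataframes.

```python
import pandas as pd

def func1(g, surf_e, Pa, t, Tco):
    # Scalar version from the question: expects one row's values at a time.
    if t <= Tco:
        return g - (Pa * surf_e * g)
    return 0.0

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "gra": [10.0, 20.0, 30.0],
    "p_a": [0.25, 0.5, 0.125],
    "t":   [1.0, 5.0, 3.0],
})
Se, Tc = 2.0, 3.0  # assumed scalar parameters

# Comparing a column with a scalar gives one boolean per row, not one boolean.
mask = df["t"] <= Tc
print(mask.tolist())

# Using that Series as an `if` condition is what raises the ValueError.
try:
    if mask:
        pass
except ValueError as err:
    print("ValueError:", err)

# Row-wise alternative: axis=1 passes each row to the lambda, which pulls
# the scalar values out of the row and forwards them to func1.
df["po"] = df.apply(
    lambda row: func1(row["gra"], Se, row["p_a"], row["t"], Tc),
    axis=1,
)
print(df)
```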
