Multiple outputs from applying a function in pandas?

Time:05-10

I'm trying to apply a function to a column in a DataFrame. The function takes one input variable, but I need it to return two output values, e.g.:

def func(var1):
    if var1<5:
       return A=3, B=5
    elif var1<10:
       return A=3, B=10
    else:
       return A=7, B=10

Is there a way to do this without defining two separate functions for A and B?

Thanks

CodePudding user response:

Use numpy.select with each mask reshaped into a column vector, so it broadcasts against a two-element [A, B] choice:

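import numpy as np
import pandas as pd

df = pd.DataFrame({'var1': range(3, 15)})

# Each mask has shape (n, 1), so it broadcasts against a 2-element [A, B] choice.
df[['A', 'B']] = np.select([df['var1'].lt(5).to_numpy()[:, None],
                            df['var1'].lt(10).to_numpy()[:, None]],
                           [[3, 5], [3, 10]],
                           default=[7, 10])
print(df)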
    var1  A   B
0      3  3   5
1      4  3   5
2      5  3  10
3      6  3  10
4      7  3  10
5      8  3  10
6      9  3  10
7     10  7  10
8     11  7  10
9     12  7  10
10    13  7  10
11    14  7  10
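
To see why the masks are reshaped with [:, None], here is a minimal sketch on a smaller, made-up frame (the three sample values are an assumption for illustration). It uses np.where with a single condition instead of np.select, only to keep the shapes easy to follow:

```python
import numpy as np
import pandas as pd

# Small illustrative input (not the data from the answer above).
df = pd.DataFrame({'var1': [3, 7, 12]})

mask = df['var1'].lt(5).to_numpy()   # 1-D boolean array, shape (3,)
col = mask[:, None]                  # column vector, shape (3, 1)

# A (3, 1) mask broadcasts against a length-2 choice, giving a (3, 2) result:
# rows where the mask is True take [3, 5], the rest take [7, 10].
pair = np.where(col, [3, 5], [7, 10])

print(mask.shape, col.shape, pair.shape)
print(pair)
```

Without the extra axis, a length-3 mask cannot broadcast against a length-2 choice, which is why the reshape is needed before the two-column assignment.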

Your own function also works if you change it to return tuples:

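def func(var1):
    # Return both values together as a tuple.
    if var1 < 5:
        return (3, 5)
    elif var1 < 10:
        return (3, 10)
    else:
        return (7, 10)

# apply() yields a Series of tuples; tolist() turns it into rows for the two new columns.
df[['A', 'B']] = df['var1'].apply(func).tolist()
print(df)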
    var1  A   B
0      3  3   5
1      4  3   5
2      5  3  10
3      6  3  10
4      7  3  10
5      8  3  10
6      9  3  10
7     10  7  10
8     11  7  10
9     12  7  10
10    13  7  10
11    14  7  10
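
If you would rather keep the original frame untouched, one possible variant is to expand the tuples into their own DataFrame and join it back. This is only a sketch: the names pairs, expanded and out, and the string index, are made up for illustration.

```python
import pandas as pd

def func(var1):
    # Same thresholds as above; both values come back as one tuple.
    if var1 < 5:
        return (3, 5)
    elif var1 < 10:
        return (3, 10)
    return (7, 10)

# Illustrative input with a non-default index.
df = pd.DataFrame({'var1': [3, 7, 12]}, index=['x', 'y', 'z'])

pairs = df['var1'].apply(func)   # Series of tuples, same index as df
expanded = pd.DataFrame(pairs.tolist(), index=df.index, columns=['A', 'B'])
out = df.join(expanded)          # new frame; df itself is unchanged
print(out)
```

Passing index=df.index keeps the new columns aligned with the original rows even when the index is not 0..n-1.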

CodePudding user response:

Here is an approach that uses a lookup DataFrame to define the values of the columns to add:

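import numpy as np
import pandas as pd

# Input used for this example.
df = pd.DataFrame({'var1': [0, 3, 5, 7, 9, 11]})

# One row of (A, B) values per case.
choices = pd.DataFrame([[3, 5], [3, 10], [7, 10]], columns=['A', 'B'])
#    A   B
# 0  3   5
# 1  3  10
# 2  7  10

# Row number in `choices` for each value of var1 (2 is the default case).
a = np.select([df['var1'].lt(5), df['var1'].lt(10)], [0, 1], 2)
# array([0, 0, 1, 1, 1, 2])

# Pick the matching rows, realign them to df's index, and join.
# This returns a new DataFrame; df itself is unchanged.
df.join(choices.iloc[a].set_axis(df.index))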

Output:

   var1  A   B
0     0  3   5
1     3  3   5
2     5  3  10
3     7  3  10
4     9  3  10
5    11  7  10

Input used: df = pd.DataFrame({'var1': [0, 3, 5, 7, 9, 11]})
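
Because the conditions here are plain thresholds on one column, the row codes could also be computed with np.digitize instead of np.select. This is a swapped-in alternative, not part of the answer above, and the names codes and result are made up for the sketch:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'var1': [0, 3, 5, 7, 9, 11]})
choices = pd.DataFrame([[3, 5], [3, 10], [7, 10]], columns=['A', 'B'])

# With bins [5, 10]: values below 5 map to 0, values from 5 up to (not including) 10
# map to 1, and values of 10 or more map to 2.
codes = np.digitize(df['var1'].to_numpy(), [5, 10])

# Look up the matching row of `choices` for every code and realign to df's index.
result = df.join(choices.iloc[codes].set_axis(df.index))
print(result)
```

This only fits when the cases are ordered, non-overlapping ranges of a single value; np.select remains the more general choice for arbitrary conditions.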
