Home > Blockchain >  Python Pandas new column creation with .assign multiple conditions and combination with lambda if el
Python Pandas new column creation with .assign multiple conditions and combination with lambda if el

Time:10-07

Suppose I have the following DF

import pandas as pd

data_dic = {
    "a": [0,0,1,2],
    "b": [0,3,4,5],
    "c": [6,7,8,9]
}
df = pd.DataFrame(data_dic)

I want to put multiple logical conditions and create a new column "d", using "if else" lambda and ".assign" command, I tried several forms but can not get any result (always boolean errors or other issues). Can some one point me to the right direction or proper use of similar set of commands?

something like this (supported??)

(df
.assign(d = lambda x: (x['a'] x['b']) if ((x['a']==0) & (x['b']>0) & (x['c']==7)) else 0 )
)

errors: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

expected result:

out: 

   a    b   c   d
0   0   0   6   0
1   0   3   7   3
2   1   4   8   0
3   2   5   9   0

thank you so much for suggestions,

expected commands in general form:

.assign(new_col = lambda x: (if True numeric sum col_a col_b) if (logical conditions using col_a,b,c) else (default result))

I have tried several forms with boolean (bool()) but not sure what is the right syntaxes or if this is the right approach.

CodePudding user response:

using np.where and .assign is vectorized and much faster than apply and lambda

df = df.assign(d=np.where((df["a"].eq(0) & df["b"].gt(0) & df["c"].eq(7)), df["a"]   df["b"], 0))

print(df)

   a  b  c  d
0  0  0  6  0
1  0  3  7  3
2  1  4  8  0
3  2  5  9  0

CodePudding user response:

two things

  1. apply lambda to a df. you need to associate lambda to the DF. hence df.apply(...)
  2. to apply to the row, pass argument axis=1
(df
.assign(d = df.apply(lambda x: (x['a'] x['b']) if ((x['a']==0) & (x['b']>0) & (x['c']==7)) else 0, axis=1) )
)
    a   b   c   d
0   0   0   6   0
1   0   3   7   3
2   1   4   8   0
3   2   5   9   0

alternately,

you don't need to use assign. just assign the result to the new column

df['e'] = (df.apply(lambda x: (x['a'] x['b']) 
                    if ((x['a']==0) & (x['b']>0) & (x['c']==7)) 
                    else 0
                    , axis=1) )

    a   b   c   d
0   0   0   6   0
1   0   3   7   3
2   1   4   8   0
3   2   5   9   0
  • Related