Python dataframe if conditions are true column value=1, otherwise 0

Time:10-13

I have a dataframe and I want to make a new column that is 0 if any one of about 20 conditions is true, and 1 otherwise. My current code is:

conditions=[(lastYear_sale['Low_Balance_Flag'==0])&(lastYear_sale.another_Flag==0),(lastYear_sale['Low_Balance_Flag'==1])&(lastYear_sale.another_Flag==1)]
choices=[1,0]
lastYear_sale['eligible']=np.select(conditions,choices,default=0)

So here is a simplified version of the dataframe I have, that looks a little like:

data = {'ID':['a', 'b', 'c', 'd'],
        'Low_Balance_Flag':[1, 0, 1, 0], 'another_Flag':[0,0,1,1]}
dfr = pd.DataFrame(data)

I would like to add a column called eligible that is 0 if Low_Balance_Flag or another_Flag is 1, and 1 only if all of the flag columns are 0. My attempt fails with an error that just says KeyError: False, but I can't see what the problem is. Thanks for any suggestions! :)

Edit: so the output I'd be looking for in this case would be:

ID    Low_Balance_Flag    another_Flag    eligible
a     1                   0               0
b     0                   0               1
c     1                   1               0
d     0                   1               0

CodePudding user response:

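Your conditions are basically what you need; they just have to be written correctly. In the question the comparison sits inside the square brackets, so lastYear_sale['Low_Balance_Flag'==0] first evaluates 'Low_Balance_Flag'==0 to False and then looks up a column named False, which raises KeyError: False. Close the bracket before comparing. I assume the two comma-separated clauses are your two conditions, and that the data frame lastYear_sale is supposed to be dfr. Then: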

conditions=((lastYear_sale['Low_Balance_Flag']==0)&(lastYear_sale.another_Flag==0))|((lastYear_sale['Low_Balance_Flag']==1)&(lastYear_sale.another_Flag==1))

print((~conditions).astype(int))

0    1
1    0
2    0
3    0
dtype: int64
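
A small sketch of where the KeyError: False comes from, using the sample dfr from the question. The names key, error_message and mask are made up for illustration only.

```python
import pandas as pd

# Sample frame from the question
dfr = pd.DataFrame({
    'ID': ['a', 'b', 'c', 'd'],
    'Low_Balance_Flag': [1, 0, 1, 0],
    'another_Flag': [0, 0, 1, 1],
})

# With the comparison inside the brackets, Python compares the string
# 'Low_Balance_Flag' with the integer 0 first, which is simply False.
key = ('Low_Balance_Flag' == 0)

try:
    dfr[key]  # same as dfr[False]: pandas looks for a column labelled False
    error_message = None
except KeyError as exc:
    error_message = repr(exc)

# Closing the bracket first compares the column values instead
# and yields a boolean Series that can be combined with & and |.
mask = dfr['Low_Balance_Flag'] == 0
```

Here key is a plain Python bool, while mask is a boolean Series with one entry per row.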

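If your conditions are somewhat dynamic and you need to build them within code, you can build a string expression and evaluate it with pandas.DataFrame.eval, which returns a boolean Series (pandas.DataFrame.query takes the same kind of expression but returns the matching rows instead).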
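
As a rough sketch of building the expression in code, assuming every condition has the form "flag column equals 1" and that the column names are valid Python identifiers: flag_cols is a hypothetical list that would hold the roughly 20 column names in the real data.

```python
import pandas as pd

# Sample frame from the question
dfr = pd.DataFrame({
    'ID': ['a', 'b', 'c', 'd'],
    'Low_Balance_Flag': [1, 0, 1, 0],
    'another_Flag': [0, 0, 1, 1],
})

# Hypothetical list of flag columns (only two exist in the sample data)
flag_cols = ['Low_Balance_Flag', 'another_Flag']

# Builds "(Low_Balance_Flag == 1) | (another_Flag == 1)"
expr = ' | '.join(f'({col} == 1)' for col in flag_cols)

# eval returns a boolean Series: True where at least one flag is set
any_flag_set = dfr.eval(expr)

# eligible is 1 only where no flag is set
dfr['eligible'] = (~any_flag_set).astype(int)
```

The explicit parentheses around each comparison keep the generated string unambiguous to read, whatever operators end up in it.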

Edit: I still assume dfr is the same as lastYear_sale. Also, the data in dfr does not match the data in the expected output.

# Use either of these; both are True only where neither flag is 1
conditions = ~((lastYear_sale['Low_Balance_Flag']==1)|(lastYear_sale.another_Flag==1))
conditions = ~lastYear_sale.eval('(Low_Balance_Flag==1)|(another_Flag==1)')

# Convert the boolean result to 1/0
dfr['eligible'] = conditions.astype(int)
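
For completeness, here is a sketch of the np.select approach from the question with the brackets corrected, run on the sample dfr. It also shows one possible alternative that scales to many flag columns without writing each condition by hand; flag_cols and eligible_alt are names invented for this example.

```python
import numpy as np
import pandas as pd

# Sample frame from the question
dfr = pd.DataFrame({
    'ID': ['a', 'b', 'c', 'd'],
    'Low_Balance_Flag': [1, 0, 1, 0],
    'another_Flag': [0, 0, 1, 1],
})

# np.select as in the question, comparing the column values
# (bracket closed before ==) rather than the column name.
conditions = [(dfr['Low_Balance_Flag'] == 0) & (dfr['another_Flag'] == 0)]
choices = [1]
dfr['eligible'] = np.select(conditions, choices, default=0)

# Alternative: check all flag columns at once.
# flag_cols is a hypothetical list; extend it with the other flag names.
flag_cols = ['Low_Balance_Flag', 'another_Flag']
dfr['eligible_alt'] = dfr[flag_cols].eq(0).all(axis=1).astype(int)
```

On the sample data both columns come out the same, with only row b marked eligible.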