I have a dataframe and I want to make a new column that is 0 if any one of about 20 conditions is true, and 1 otherwise. My current code is:
conditions=[(lastYear_sale['Low_Balance_Flag'==0])&(lastYear_sale.another_Flag==0),(lastYear_sale['Low_Balance_Flag'==1])&(lastYear_sale.another_Flag==1)]
choices=[1,0]
lastYear_sale['eligible']=np.select(conditions,choices,default=0)
So here is a simplified version of the dataframe I have, that looks a little like:
data = {'ID':['a', 'b', 'c', 'd'],
'Low_Balance_Flag':[1, 0, 1, 0], 'another_Flag':[0,0,1,1]}
dfr = pd.DataFrame(data)
I would like to add a column called eligible that is 0 if Low_Balance_Flag or another_Flag is 1, and 1 if all of the flag columns are 0. My attempt fails with an error that just says KeyError: False, but I can't see what is wrong. Thanks for any suggestions! :)
Edit: so the output I'd be looking for in this case would be:
ID  Low_Balance_Flag  another_Flag  eligible
a   1                 0             0
b   0                 0             1
c   1                 1             0
d   0                 1             0
CodePudding user response:
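So the conditions list is basically what you need; you just need to write the conditions properly. In your attempt the comparison sits inside the brackets, lastYear_sale['Low_Balance_Flag'==0], so Python evaluates 'Low_Balance_Flag'==0 to False first and pandas then looks for a column named False, which is where KeyError: False comes from. I assume what you have are two condition clauses separated by a comma. If you have a data frame lastYear_sale (which I believe is supposed to be dfr), then: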
conditions=((lastYear_sale['Low_Balance_Flag']==0)&(lastYear_sale.another_Flag==0))|((lastYear_sale['Low_Balance_Flag']==1)&(lastYear_sale.another_Flag==1))
print((~conditions).astype(int))
0 1
1 0
2 0
3 0
dtype: int64
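The bracket placement behind KeyError: False can be reproduced on the small dfr frame from the question. This is only a minimal sketch; the variable names key, raised and mask are made up for illustration.

```python
import pandas as pd

dfr = pd.DataFrame({
    'ID': ['a', 'b', 'c', 'd'],
    'Low_Balance_Flag': [1, 0, 1, 0],
    'another_Flag': [0, 0, 1, 1],
})

# Misplaced bracket: the string is compared to 0 before pandas sees it,
# so the "column name" that gets looked up is the bool False.
key = ('Low_Balance_Flag' == 0)

try:
    dfr[key]            # same as dfr[False]
    raised = False
except KeyError as exc:
    raised = True
    print('KeyError:', exc)

# Correct placement: select the column first, then compare it to 0.
mask = dfr['Low_Balance_Flag'] == 0
print(mask.tolist())    # [False, True, False, True]
```

Moving the closing bracket so that it ends right after the column name is the only change needed to turn the failing lookup into an elementwise comparison.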
If your conditions are somewhat dynamic and you need to build them within code you can use pandas.DataFrame.query to evaluate a string expression you have built.
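As a rough sketch of building the expression in code, the string can be assembled from a list of column names and passed to query or eval. The list flag_cols is hypothetical (in the real data it would hold the roughly 20 flag columns), and this assumes every column name is a valid Python identifier; other names would need backtick quoting inside the expression.

```python
import pandas as pd

dfr = pd.DataFrame({
    'ID': ['a', 'b', 'c', 'd'],
    'Low_Balance_Flag': [1, 0, 1, 0],
    'another_Flag': [0, 0, 1, 1],
})

# Hypothetical list of flag columns; extend it with the remaining flags.
flag_cols = ['Low_Balance_Flag', 'another_Flag']

# Builds the string "Low_Balance_Flag == 0 and another_Flag == 0".
expr = ' and '.join(f'{col} == 0' for col in flag_cols)

# query() filters the rows; eval() returns the boolean mask itself.
eligible_rows = dfr.query(expr)
dfr['eligible'] = dfr.eval(expr).astype(int)

print(dfr['eligible'].tolist())   # [0, 1, 0, 0]
```

Here query returns only row b, while eval keeps all rows and yields the mask that becomes the 0/1 column.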
Edit: I still assume dfr is the same as lastYear_sale. Also, the data in dfr does not match the data in the expected output.
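# Use either of these; both give True where neither flag is 1
conditions = ~((lastYear_sale['Low_Balance_Flag'] == 1) | (lastYear_sale['another_Flag'] == 1))
conditions = ~lastYear_sale.eval('(Low_Balance_Flag == 1) | (another_Flag == 1)')
# Convert the boolean mask to 0/1 (dfr is assumed to be the same frame as lastYear_sale)
dfr['eligible'] = conditions.astype(int)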
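With about 20 flag columns, writing each comparison by hand gets long. One possible alternative, sketched here on the sample dfr from the question, is to compare all flag columns at once and reduce across each row. The name flag_cols is an assumption standing in for the full list of flag columns.

```python
import pandas as pd

dfr = pd.DataFrame({
    'ID': ['a', 'b', 'c', 'd'],
    'Low_Balance_Flag': [1, 0, 1, 0],
    'another_Flag': [0, 0, 1, 1],
})

# Assumed list of flag columns; in the real data this would hold all ~20.
flag_cols = ['Low_Balance_Flag', 'another_Flag']

# A row is eligible only if every flag in that row equals 0.
dfr['eligible'] = (dfr[flag_cols] == 0).all(axis=1).astype(int)

print(dfr)
#   ID  Low_Balance_Flag  another_Flag  eligible
# 0  a                 1             0         0
# 1  b                 0             0         1
# 2  c                 1             1         0
# 3  d                 0             1         0
```

This matches the expected output in the question and avoids both np.select and string expressions, at the cost of assuming every condition has the same "flag equals 0" shape.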