Fill a column based on levels of first column and condition on second column


Let's say we have a dataframe, df, with two columns as given below. Variable A has three levels: 1, 2, and 3. Variable B has three levels: YES, NO, and OTHER. We want to derive another dataframe, df2, with a variable C that takes the value "1" if there is at least one YES in B for a given level of A, and "0" otherwise.


A   B
1   YES
1   YES
1   NO
1   YES
1   NO
2   YES
2   YES
2   YES
2   NO
2   YES
3   NO
3   NO
3   NO


A   C
1   1
2   1
3   0

CodePudding user response:

Use groupby: flag the rows where B equals 'YES', then check whether any row is flagged within each level of A:

>>> df['B'].eq('YES').groupby(df['A']).any().astype(int).reset_index(name='C')
   A  C
0  1  1
1  2  1
2  3  0
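
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and splits the one-liner into its steps. The way df is constructed and the intermediate names (has_yes, any_yes, df2) are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data from the question (this construction is an assumption)
df = pd.DataFrame({
    "A": [1] * 5 + [2] * 5 + [3] * 3,
    "B": ["YES", "YES", "NO", "YES", "NO",
          "YES", "YES", "YES", "NO", "YES",
          "NO", "NO", "NO"],
})

# Step 1: boolean Series, True where B is exactly 'YES'
has_yes = df["B"].eq("YES")

# Step 2: within each level of A, is at least one row True?
any_yes = has_yes.groupby(df["A"]).any()

# Step 3: cast True/False to 1/0 and move the index (A) back into a column
df2 = any_yes.astype(int).reset_index(name="C")
print(df2)
```

Because the comparison is an exact match on 'YES', any other value in B (NO, OTHER, or a differently cased string) counts as "not YES".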

CodePudding user response:

One option is to convert column B into numbers with a defaultdict (YES becomes 1, everything else 0), then group by A and take the max:

from collections import defaultdict

# Any key other than 'YES' falls back to the default, int() == 0
d = defaultdict(int)
d['YES'] = 1
df.assign(B=df.B.map(d)).groupby('A', as_index=False).agg(C=('B', 'max'))

   A  C
0  1  1
1  2  1
2  3  0
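
As a rough check of how the mapping treats values other than YES and NO, here is a small self-contained sketch. The frame below is hypothetical (it is not the data from the question) and includes an OTHER row; the names mapped and df2 are also just for illustration.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical frame: group 3 contains NO and OTHER but no YES
df = pd.DataFrame({
    "A": [1, 1, 2, 3, 3],
    "B": ["YES", "NO", "YES", "NO", "OTHER"],
})

# Missing keys fall back to int() == 0, so only 'YES' maps to 1
d = defaultdict(int)
d["YES"] = 1

# Series.map uses the dict's __missing__ default instead of producing NaN
mapped = df["B"].map(d)

# The max of the 0/1 flags per group is 1 exactly when the group has a YES
df2 = df.assign(B=mapped).groupby("A", as_index=False).agg(C=("B", "max"))
print(df2)
```

With a plain dict instead of a defaultdict, the unmapped values would become NaN, so the default is what keeps column C as clean 0/1 integers.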