Home > Software engineering >  Make a new column based on group by conditionally in Python
Make a new column based on group by conditionally in Python

Time:11-20

I have a dataframe:

id group
x1   A
x1   B
x2   A
x2   A
x3   B

I would like to create a new column new_group with the following conditions: If there are 2 unique group values within in the same id such as group A and B from rows 1 and 2, new_group should have "two" as its value. If there are only 1 unique group values within the same id such as group A from rows 3 and 4, the value for new_group should be that same group A. Otherwise, specify B.

This is what I am looking for:

id group new_group
x1   A     two
x1   B     two
x2   A      A
x2   A      A
x3   B      B

I tried something like this but don't know how to capture all the if-else conditions df.groupby("id")["group"].filter(lambda x: x.nunique() == 2)

CodePudding user response:

Almost there. Change filter to transform and use a condition:

df['new_group'] = df.groupby("id")["group"] \
                    .transform(lambda x: 'two' if (x.nunique() == 2) else x)
print(df)

# Output:
   id group new_group
0  x1     A       two
1  x1     B       two
2  x2     A         A
3  x2     A         A
4  x3     B         B

CodePudding user response:

Please use the inflect library. That way you will convert any integer to word. Code beloow

import inflect

p = inflect.engine()
df['temp']=df.groupby('id')['group'].transform('nunique')

df =df.assign(new_group= np.where(df['temp'].ge(2),df['temp'].apply(lambda x: p.number_to_words(x)),df['group'])).drop(columns=['temp'])





    id group new_group
0  x1     A       two
1  x1     B       two
2  x2     A         A
3  x2     A         A
4  x3     B         B
  • Related