How do I skip some value when I'm using pandas df.transform

Time:11-28

I want to replace the names of items that occur fewer than two times with "None", but I don't want certain items to be changed.

The original df

| Column A | Column B |
| -------- | -------- |
| Cat      | Fish     |
| Cat      | Bone     |
| Camel    | Fish     |
| Dog      | Bone     |
| Dog      | Bone     |
| Tiger    | Bone     |

I tried this to convert the names:

df.loc[df.groupby('Column A')['Column A'].transform('count').lt(2), 'Column A'] = "None"

This produces:

| Column A | Column B |
| -------- | -------- |
| Cat      | Fish     |
| Cat      | Bone     |
| None     | Fish     |
| Dog      | Bone     |
| Dog      | Bone     |
| None     | Bone     |

What should I do if I want to keep the "Tiger"?

CodePudding user response:

Use several conditions for your boolean indexing:

# is the count < 2?
m1 = df.groupby('Column A')['Column A'].transform('count').lt(2)
# is the name NOT Tiger?
m2 = df['Column A'].ne('Tiger')

# if both conditions are True, change to "None"
df.loc[m1&m2, 'Column A'] = "None"
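
For reference, here is a minimal self-contained sketch of the same idea, run against the sample data from the question. The `keep` list is a hypothetical name introduced here, not something from the question. It uses `isin` instead of `ne` so that more than one name can be protected if needed.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Column A": ["Cat", "Cat", "Camel", "Dog", "Dog", "Tiger"],
    "Column B": ["Fish", "Bone", "Fish", "Bone", "Bone", "Bone"],
})

# Hypothetical list of names that must never be replaced
keep = ["Tiger"]

# True where the name occurs fewer than 2 times in the column
rare = df.groupby("Column A")["Column A"].transform("count").lt(2)

# True where the name is NOT in the protected list
unprotected = ~df["Column A"].isin(keep)

# Replace only the rows that are both rare and unprotected
df.loc[rare & unprotected, "Column A"] = "None"

print(df)
```

With the sample data, only "Camel" should be replaced: "Tiger" also occurs once, but the second mask excludes it. Note that `"None"` here is the string, as in the question, not Python's `None`.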