I want to convert the names of items that occur fewer than two times to "None", but I don't want some items to be changed.
The original df:
| Column A | Column B |
| -------- | -------- |
| Cat | Fish |
| Cat | Bone |
| Camel | Fish |
| Dog | Bone |
| Dog | Bone |
| Tiger | Bone |
I tried this to convert the names:
df.loc[df.groupby('Column A')['Column A'].transform('count').lt(2), 'Column A'] = "None"
This gives:
| Column A | Column B |
| -------- | -------- |
| Cat | Fish |
| Cat | Bone |
| None | Fish |
| Dog | Bone |
| Dog | Bone |
| None | Bone |
What should I do if I want to keep "Tiger" unchanged?
CodePudding user response:
Use several conditions for your boolean indexing:
# is the count < 2?
m1 = df.groupby('Column A')['Column A'].transform('count').lt(2)
# is the name NOT Tiger?
m2 = df['Column A'].ne('Tiger')
# if both conditions are True, change to "None"
df.loc[m1 & m2, 'Column A'] = "None"
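
For reference, here is a self-contained sketch of the same idea that rebuilds the example frame from the question. It assumes you may want to protect more than one name, so it uses a list called `keep` (a hypothetical name, not from the question) with `isin` instead of a single `ne('Tiger')` check:

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "Column A": ["Cat", "Cat", "Camel", "Dog", "Dog", "Tiger"],
    "Column B": ["Fish", "Bone", "Fish", "Bone", "Bone", "Bone"],
})

# Hypothetical list of names that must never be replaced
keep = ["Tiger"]

# True where the name occurs fewer than two times
rare = df.groupby("Column A")["Column A"].transform("count").lt(2)

# True where the name is NOT in the protected list
not_kept = ~df["Column A"].isin(keep)

# Replace only rows that are rare and not protected
df.loc[rare & not_kept, "Column A"] = "None"

print(df)
```

With the data above, only "Camel" should be replaced; "Tiger" stays because it is in `keep`. Note that this writes the string "None", as in the question, not a missing value.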