Home > Software design >  Python (pandas) - How to group values in one column and then delete or keep that group based on valu
Python (pandas) - How to group values in one column and then delete or keep that group based on valu

Time:04-29

Let's say I have the following pandas dataset:

Column 1 Column 2
dog RE
dog RE FX
cat RE BA
mouse AQ
mouse RE FX
salmon AQ

Essentially what I would like to do is group the values in Column 1 and then either keep or delete them based on the values in Column 2. So for example, I want to delete all values in a group in Column 1 if ANY of the corresponding rows in Column 2 are "RE" or "RE BA". Based on the dataset above, the output would be the following:

Column 1 Column 2
mouse AQ
mouse RE FX
salmon AQ

I am struggling with this because although I understand how to drop rows based on whether they contain a specific value, I don't understand how to drop entire groups based on whether ANY of the rows in that group contain a specific value. Any help would be greatly appreciated!

CodePudding user response:

You can try groupby then filter

out = df.groupby("Column 1").filter(lambda df: ~df['Column 2'].isin(["RE", "RE BA"]).any())
print(out)

  Column 1 Column 2
3    mouse       AQ
4    mouse    RE FX
5   salmon       AQ

CodePudding user response:

bools = df.groupby("Column 1").apply(lambda x: ~x["Column 2"].isin(["RE", "RE BA"]).any())
df[bools.loc[df["Column 1"]].reset_index(drop=True)]
#   Column 1 Column 2
# 3    mouse       AQ
# 4    mouse    RE FX
# 5   salmon       AQ
  • Related