Remove groups where there are not at least two different values within a column in pandas

Time:03-19

I have a dataframe such as:

COL1 COL2 SP
G1   A    SP1
G1   A    SP2
G2   B    SP1
G2   B    SP1
G3   C    SP7
G3   C    SP3
G4   A    SP8
G4   A    SP8

I would like to keep only the (COL1, COL2) groups that contain at least two different SP values.

I would then get:

COL1 COL2 SP
G1   A    SP1
G1   A    SP2
G3   C    SP7
G3   C    SP3

CodePudding user response:

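Use groupby with transform('nunique') to count the distinct SP values in each (COL1, COL2) group, then keep the rows whose group count is greater than 1: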

# transform('nunique') returns the per-group distinct count aligned to every row of df
out = df[df.groupby(['COL1', 'COL2'])['SP'].transform('nunique') > 1]
out
Out[245]: 
  COL1 COL2   SP
0   G1    A  SP1
1   G1    A  SP2
4   G3    C  SP7
5   G3    C  SP3
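
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the example dataframe from the question by hand (the construction is my own assumption about how df was created) and applies the same transform('nunique') filter. It also shows groupby(...).filter(...) as an alternative, which is not part of the answer above but should select the same rows; it tends to be slower on large data because it calls a Python function once per group.

```python
import pandas as pd

# Example data copied from the question; building it this way is an assumption.
df = pd.DataFrame({
    "COL1": ["G1", "G1", "G2", "G2", "G3", "G3", "G4", "G4"],
    "COL2": ["A", "A", "B", "B", "C", "C", "A", "A"],
    "SP":   ["SP1", "SP2", "SP1", "SP1", "SP7", "SP3", "SP8", "SP8"],
})

# Distinct SP count per (COL1, COL2) group, broadcast back onto every row.
distinct_sp = df.groupby(["COL1", "COL2"])["SP"].transform("nunique")

# Keep only rows whose group has more than one distinct SP value.
out = df[distinct_sp > 1]

# Alternative sketch: drop whole groups with a per-group predicate.
out_filter = df.groupby(["COL1", "COL2"]).filter(
    lambda g: g["SP"].nunique() > 1
)

print(out)
```

Both approaches keep the original row index (0, 1, 4, 5 here); call reset_index(drop=True) on the result if a fresh index is needed.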