Home > database >  how to count unique values after groupby ID
how to count unique values after groupby ID

Time:02-06

i have the following df

ID  from   to
A    0x    0c
A    0x    0f
A    0f    0n
B    0f    0c
B    0c    0f
C    0k    0j
C    0j    0k
C    0k    0a

First i want to groupby id and only keep groups if the number of unique values from from and to combined is less than 3.

so the desired df will be

B    0f    0c
B    0c    0f
C    0k    0j
C    0j    0k
C    0k    0a

CodePudding user response:

What about using a groupby filter with a lambda function that confirms the number of unique values in the from and to columns is less than or equal to 3? You can use DataFrame.stack() as a hacky solution to put all of the values in a dataframe into a single Series to use Series.nunique() with:

import pandas as pd

# Your dataframe here
df = pd.read_clipboard()

out = df.groupby("ID").filter(lambda x: x[["from", "to"]].stack().nunique() <= 3)

out:

  ID from  to
3  B   0f  0c
4  B   0c  0f
5  C   0k  0j
6  C   0j  0k
7  C   0k  0a
  • Related