i have the following df
ID from to
A 0x 0c
A 0x 0f
A 0f 0n
B 0f 0c
B 0c 0f
C 0k 0j
C 0j 0k
C 0k 0a
First i want to groupby id
and only keep groups if the number of unique values from from
and to
combined is less than 3.
so the desired df will be
B 0f 0c
B 0c 0f
C 0k 0j
C 0j 0k
C 0k 0a
CodePudding user response:
What about using a groupby filter
with a lambda function that confirms the number of unique values in the from
and to
columns is less than or equal to 3? You can use DataFrame.stack()
as a hacky solution to put all of the values in a dataframe into a single Series to use Series.nunique()
with:
import pandas as pd
# Your dataframe here
df = pd.read_clipboard()
out = df.groupby("ID").filter(lambda x: x[["from", "to"]].stack().nunique() <= 3)
out:
ID from to
3 B 0f 0c
4 B 0c 0f
5 C 0k 0j
6 C 0j 0k
7 C 0k 0a