I have a DataFrame, df:
ID col1 col2
A 11 0
A 14 0
B 15 0
B 95 1
B 81 2
c 0 1
c 9 1
I want to drop the last row of each group of the 'ID' column if that group has 3 or more rows.
Required output:
ID col1 col2
A 11 0
A 14 0
B 15 0
B 95 1
c 0 1
c 9 1
What I am trying:
df.groupby('ID').apply(lambda x: x.iloc[:-1] if len(x)>3 else x).reset_index(drop=True)
CodePudding user response:
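Use groupby and cumcount if you want to keep at most 2 rows per group. Note that this keeps the first 2 rows of every group, which matches the required output here but is not the same as dropping only the last row of a larger group.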
out = df[df.groupby('ID').cumcount() < 2] # < 2 because cumcount starts at 0
print(out)
# Output
ID col1 col2
0 A 11 0
1 A 14 0
2 B 15 0
3 B 95 1
5 c 0 1
6 c 9 1
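If the goal is to drop only the last row of groups that have at least 3 rows, rather than capping every group at 2 rows, a boolean mask built from the group size and a reversed cumcount is one possible sketch. The frame below is rebuilt from the question; the names size, from_end and out_mask are only illustrative.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "ID":   ["A", "A", "B", "B", "B", "c", "c"],
    "col1": [11, 14, 15, 95, 81, 0, 9],
    "col2": [0, 0, 0, 1, 2, 1, 1],
})

g = df.groupby("ID")
size = g["ID"].transform("size")         # row count of the group each row belongs to
from_end = g.cumcount(ascending=False)   # 0 marks the last row of each group

# Drop a row only when it is the last row of a group with at least 3 rows.
out_mask = df[~((size >= 3) & (from_end == 0))]
print(out_mask)
```

On this data the result is the same as with cumcount() < 2. The two differ for larger groups: a group of 4 rows would keep 3 rows with this mask, but only 2 with cumcount() < 2.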
CodePudding user response:
Change the condition from > to >=, so that groups with exactly 3 rows are trimmed as well:
out = df.groupby('ID').apply(lambda x: x.iloc[:-1] if len(x)>=3 else x).reset_index(drop=True)
Out[142]:
ID col1 col2
0 A 11 0
1 A 14 0
2 B 15 0
3 B 95 1
4 c 0 1
5 c 9 1
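The same rule can also be written as an explicit loop over the groups, which avoids relying on how groupby.apply treats the grouping column (that behaviour has not been the same across pandas versions). This is only a sketch: the helper name drop_last_if_large and its min_size parameter are made up for illustration, and the frame is rebuilt from the question.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "ID":   ["A", "A", "B", "B", "B", "c", "c"],
    "col1": [11, 14, 15, 95, 81, 0, 9],
    "col2": [0, 0, 0, 1, 2, 1, 1],
})

def drop_last_if_large(frame, key, min_size=3):
    """Drop the last row of every group that has at least min_size rows."""
    parts = []
    # sort=False keeps the groups in order of first appearance.
    for _, grp in frame.groupby(key, sort=False):
        parts.append(grp.iloc[:-1] if len(grp) >= min_size else grp)
    return pd.concat(parts).reset_index(drop=True)

out_loop = drop_last_if_large(df, "ID")
print(out_loop)
```

Passing min_size=4 corresponds to the original len(x) > 3 condition, which leaves this sample unchanged because no group has more than 3 rows.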