I have a DataFrame df:
date id label pred
1/1 1 0 0.2
2/1 1 1 0.5
1/1 2 1 0.9
2/1 2 1 0.3
For each id, I want the first row where the label column equals 1. For example, the desired df:
date id label pred
2/1 1 1 0.5
1/1 2 1 0.9
Thanks!
CodePudding user response:
First keep only the rows where label equals 1, then remove duplicates per id with DataFrame.drop_duplicates, which keeps the first occurrence by default:
df1 = df[df['label'].eq(1)].drop_duplicates('id')
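For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same one-liner. The frame construction is my own reconstruction of the question's data, and it assumes the rows are already in the order you consider "first" (for example, sorted by date within each id).

```python
import pandas as pd

# Sample data reconstructed from the question (assumed already in the desired order).
df = pd.DataFrame({
    "date": ["1/1", "2/1", "1/1", "2/1"],
    "id": [1, 1, 2, 2],
    "label": [0, 1, 1, 1],
    "pred": [0.2, 0.5, 0.9, 0.3],
})

# Keep rows with label == 1, then keep only the first remaining row per id.
# drop_duplicates uses keep="first" by default.
df1 = df[df["label"].eq(1)].drop_duplicates("id")
print(df1)
```

The original row index and column order are preserved. If you want a fresh 0..n-1 index, chain .reset_index(drop=True).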
CodePudding user response:
You can use groupby and take the first row per id after keeping only the rows where label is 1:
out = df[df['label'] == 1].groupby('id', as_index=False).first()
print(out)
# Output
id date label pred
0 1 2/1 1 0.5
1 2 1/1 1 0.9
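One thing to be aware of: GroupBy.first() returns the first non-null value of each column within a group, so if your data can contain NaN it may combine values from different rows. A small sketch below, using a frame reconstructed from the question, compares it with GroupBy.head(1), which returns whole rows as they are. The variable names are just for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "date": ["1/1", "2/1", "1/1", "2/1"],
    "id": [1, 1, 2, 2],
    "label": [0, 1, 1, 1],
    "pred": [0.2, 0.5, 0.9, 0.3],
})

filtered = df[df["label"] == 1]

# first(): first non-null value per column in each group; id becomes the leading column.
out_first = filtered.groupby("id", as_index=False).first()

# head(1): the first whole row of each group, original index and column order kept.
out_head = filtered.groupby("id").head(1)

print(out_first)
print(out_head)
```

On this sample both give the same rows, since there are no missing values. They differ only in index and column order.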