Existing Dataframe :
Id Date Status
A 26-01-2022 begin
A 26-01-2022 failed
A 27-01-2022 begin
A 27-01-2022 in-process
A 27-01-2022 success
B 01-02-2022 in-process
B 01-02-2022 success
B 02-02-2022 begin
B 02-02-2022 failed
Expected Dataframe :
Id Date Status
A 27-01-2022 begin
A 27-01-2022 in-process
A 27-01-2022 success
B 01-02-2022 in-process
B 01-02-2022 success
Need to keep all the activity of the Id where the status for that Id ends with success for that particular date.
tried grouping Id and Date, but stuck with dropping the relevant rows
CodePudding user response:
Use transform('last')
:
df[df.groupby(['Id', 'Date']).Status.transform('last') == 'success']
CodePudding user response:
You can use groupby
and filter
by last value of Status
column
out = (df.groupby(['Id', 'Date'])
.filter(lambda g: g['Status'].iloc[-1] == 'success'))
print(out)
Id Date Status
2 A 27-01-2022 begin
3 A 27-01-2022 in-process
4 A 27-01-2022 success
5 B 01-02-2022 in-process
6 B 01-02-2022 success