Existing Dataframe :
ID Activity Action No. Status
A1 register 1 in
A1 fill form 2 in
A2 Payment Done 3 success
A2 fill form 2 in
B1 fill form 1 in
B1 Payment Done 2 success
B1 Process_drop 3 in
B1 fill form 1 in
Expected Dataframe :
ID Activity Action No. Status
A1 register 1 in
A1 fill form 2 in
A2 Payment Done 3 success
B1 Payment Done 2 success
Need to drop the remaining rows for each ID , once Activity == Payment Done and Status == success for that particular ID.( i.e. further rows needs to be removed once the condition is achieved)
i approached by grouping the ID and applying the conditions on the columns , but not sure about dropping the rows
CodePudding user response:
cond = df.Activity.eq("Payment Done") & df.Status.eq("success")
mask = cond.groupby(df.ID).cummax().shift(fill_value=False)
new = df[~mask]
- get the terminating condition across frame
- group that True/False condition by IDs
- take the cumulative maximum; once True always True
- shift it because you want to drop after
- negate it so that we keep False's, i.e., including & before terminating condition
In [115]: new
Out[115]:
ID Activity Action No. Status
0 A1 register 1 in
1 A1 fill form 2 in
2 A2 Payment Done 3 success
5 B1 Payment Done 2 success
CodePudding user response:
You can use boolean indexing.
You can determine if there is any match per group and drop the rows:
# is the row matching your criteria?
m = df['Activity'].eq('Payment Done') & df['Status'].eq('success')
# is there NOT any match in the group?
m2 = ~m.groupby(df['ID']).transform('any')
# keep rows matching either of the above
df[m|m2]
output:
ID Activity Action No. Status
0 A1 register 1 in
1 A1 fill form 2 in
2 A2 Payment Done 3 success
5 B1 Payment Done 2 success
intermediates:
ID Activity Action No. Status m m2 m|m2
0 A1 register 1 in False True True
1 A1 fill form 2 in False True True
2 A2 Payment Done 3 success True False True
3 A2 fill form 2 in False False False
4 B1 fill form 1 in False False False
5 B1 Payment Done 2 success True False True
6 B1 Process_drop 3 in False False False
7 B1 fill form 1 in False False False