I have a dataframe df
:-
ID | Date | Event |
---|---|---|
3 | 04-09-2013 | Success |
3 | 27-09-2013 | Success |
3 | 14-10-2013 | Success |
3 | 15-10-2013 | |
3 | 18-10-2013 | Form |
3 | 19-10-2013 | Click |
If the first event for every group of ID
is Success
or followed by Consecutive Success
(note
strictly start with and /or followed by consecutive Success
)then I want to drop the ID from the dataframe(Drop all the rows for that ID
from the df
)
Also if the case where the group is this way :-
ID | Date | Event |
---|---|---|
4 | 04-09-2013 | Success |
4 | 27-09-2013 | Success |
4 | 14-10-2013 | |
4 | 15-10-2013 | |
4 | 18-10-2013 | Success |
4 | 18-10-2013 | Success |
4 | 18-10-2013 | Click |
Expected output to be (i.e a.)remove rows if the first event/consecutive events are Success
and also b.)drop the events(rows) after the last/latest success within each group(taken one example of Event click but if there are more events those are to be dropped as well):
ID | Date | Event |
---|---|---|
4 | 14-10-2013 | |
4 | 15-10-2013 | |
4 | 18-10-2013 | Success |
4 | 18-10-2013 | Success |
b.) is valid for every group irrespective of how the Event starts( for example ) :-
ID | Date | Event |
---|---|---|
5 | 14-10-2013 | |
5 | 15-10-2013 | Click |
5 | 18-10-2013 | Success |
5 | 18-10-2013 |
Expected :-
ID | Date | Event |
---|---|---|
5 | 14-10-2013 | |
5 | 15-10-2013 | Click |
5 | 18-10-2013 | Success |
How can this be achieved?
CodePudding user response:
Use:
#remove values after last Success per groups not remove only non Success groups
df = df.iloc[::-1]
m = df['Event'].eq('Success')
df = df[~m.groupby(df['ID']).transform('all') | m.groupby(df['ID']).cummax()].iloc[::-1]
#remove first conscutive Success per groups
m1 = df['Event'].ne(df.groupby(df['ID'])['Event'].shift()).groupby(df['ID']).cumsum().ne(1)
#alternative solution
#m1 = df['Event'].mask(df['Event'].eq('Success')).groupby(df['ID']).ffill().notna()
m2 = df['Event'].ne('Success')
df = df[m1 | m2]
print (df)
ID Date Event
8 4 14-10-2013 Email
9 4 15-10-2013 Email
10 4 18-10-2013 Success
11 4 18-10-2013 Success
13 5 14-10-2013 Email
14 5 15-10-2013 Click
15 5 18-10-2013 Success