I have a table like this:
event | value |
---|---|
seed | 57 |
ghy | 869 |
repo | 5324 |
repo | null |
repo | null |
trans | 32 |
harv | 12 |
weig | 6995 |
repo | 45 |
repo | null |
I would like to delete all records where event = 'repo', but after event = 'harv'.
I mean the expected result:
event | value |
---|---|
seed | 57 |
ghy | 869 |
repo | 5324 |
repo | null |
repo | null |
trans | 32 |
harv | 12 |
weig | 6995 |
Do you know how to do it the easiest way?
CodePudding user response:
Use Series.cummax
for match all next rows after matching harv
and chain by &
for bitwise AND
or |
for bitwise OR
with mask for compare for equal or not equal repo
:
df = df[~(df['event'].eq('harv').cummax() & df['event'].eq('repo'))]
#alternative
#df = df[~df['event'].eq('harv').cummax() | df['event'].ne('repo')]
print (df)
event value
0 seed 57.0
1 ghy 869.0
2 repo 5324.0
3 repo NaN
4 repo NaN
5 trans 32.0
6 harv 12.0
7 weig 6995.0