How to delete previous zero rows with the same ID when the first condition is met in Pandas

I have the following dataset. For each ID, I need to delete the rows where Flag is 0 that come before the first row where Flag is 1.

ID      Flag
103200  0
103200  1
103200  0
104752  0
104752  0
104752  1
104752  0
104752  1
104752  0
104752  0
104760  0
104760  1

Here is the result I want:

ID     Flag
103200  1
103200  0
104752  1
104752  0
104752  1
104752  0
104752  0
104760  1

CodePudding user response：

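Use groupby.cummax with boolean indexing. The cumulative maximum of Flag within each ID stays 0 until the first 1 appears, so keeping only the rows where it is nonzero drops the leading zeros: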

out = df[df.groupby('ID')['Flag'].cummax().ne(0)]

# or
# out = df[df['Flag'].ne(0).groupby(df['ID']).cummax()]

output:

        ID  Flag
1   103200     1
2   103200     0
5   104752     1
6   104752     0
7   104752     1
8   104752     0
9   104752     0
11  104760     1
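
To make the intermediate step visible, here is a minimal self-contained sketch of the same idea. It rebuilds the question's data by hand, and the name `seen_one` is only an illustrative choice, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID":   [103200, 103200, 103200, 104752, 104752, 104752, 104752,
             104752, 104752, 104752, 104760, 104760],
    "Flag": [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
})

# Running maximum of Flag within each ID:
# 0 until the first 1 of that ID, then 1 for every later row.
seen_one = df.groupby("ID")["Flag"].cummax()

# Keep only the rows at or after the first 1 of their ID.
out = df[seen_one.ne(0)]

# Show the helper column next to the data, then the filtered result.
print(df.assign(seen_one=seen_one))
print(out)
```

One consequence of this mask is that an ID whose Flag never reaches 1 would lose all of its rows, since its running maximum stays 0. Whether that is wanted depends on the use case.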

CodePudding user response：

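Use the DataFrame index to find the rows where Flag == 1, then drop the row at each of those index labels minus 1. This removes only the single row directly before each 1, so the output below differs from the requested result.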

import pandas as pd

data="""ID      Flag
103200  0
103200  1
103200  0
104752  0
104752  0
104752  1
104752  0
104752  1
104752  0
104752  0
104760  0
104760  1"""

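# Parse the text into rows: the header uses six spaces as separator, the data rows use two
mylist = []
for item in data.split("\n"):
    elements = item.split('      ')
    if len(elements) == 1:
        elements = item.split('  ')
    mylist.append(elements)

df = pd.DataFrame(mylist)
# Promote the first row to column names, then remove it from the data
df.columns = df.iloc[0]
df.drop(index=df.index[0], inplace=True)
df["Flag"] = df["Flag"].astype(int)
# Drop the row directly before every Flag == 1 row (index label minus 1)
df.drop(df[df["Flag"] == 1].index - 1, inplace=True)
print(df)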

output

0       ID  Flag
2   103200     1
3   103200     0
4   104752     0
6   104752     1
8   104752     1
9   104752     0
10  104752     0
12  104760     1
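
An index-based approach can also reproduce the result requested in the question if it keeps everything from the first Flag == 1 of each ID onward, rather than dropping single rows. The following is a hedged sketch of that variant, not the answerer's code. It assumes the data is whitespace-separated and parses it with `pd.read_csv`, so the index starts at 0 instead of 1. The names `pos` and `first_one` are illustrative.

```python
import io

import pandas as pd

data = """ID      Flag
103200  0
103200  1
103200  0
104752  0
104752  0
104752  1
104752  0
104752  1
104752  0
104752  0
104760  0
104760  1"""

# Assumption: columns are separated by runs of whitespace
df = pd.read_csv(io.StringIO(data), sep=r"\s+")

# Index labels as a Series aligned with df
pos = df.index.to_series()

# For each ID, the smallest index label where Flag == 1,
# broadcast back to every row of that ID (NaN if the ID has no 1).
first_one = pos.where(df["Flag"].eq(1)).groupby(df["ID"]).transform("min")

# Keep rows at or after the first 1 of their ID.
# Comparisons against NaN are False, so IDs without a 1 are dropped entirely.
out = df[pos.ge(first_one)]
print(out)
```

This selects the same rows as the groupby.cummax answer on the sample data. The cummax version is shorter, while this one may be easier to adapt if the cut-off position itself is needed.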