How to delete previous zero rows with the same id when the first condition is met in Pandas

Time:10-13

I have the following dataset. For each ID, I need to delete the 0 rows that come before the first row where Flag is 1.

ID      Flag
103200  0
103200  1
103200  0
104752  0
104752  0
104752  1
104752  0
104752  1
104752  0
104752  0
104760  0
104760  1

Here is the result I want:

ID      Flag
103200  1
103200  0
104752  1
104752  0
104752  1
104752  0
104752  0
104760  1

CodePudding user response:

Use a groupby.cummax and boolean indexing:

out = df[df.groupby('ID')['Flag'].cummax().ne(0)]

# or
# out = df[df['Flag'].ne(0).groupby(df['ID']).cummax()]

output:

        ID  Flag
1   103200     1
2   103200     0
5   104752     1
6   104752     0
7   104752     1
8   104752     0
9   104752     0
11  104760     1
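
To see why this works, it can help to look at the intermediate mask. The following is a minimal sketch that rebuilds the sample data from the question (the names `seen_one` and `out` are only illustrative). The running maximum of Flag within each ID stays 0 until the first 1 appears and is 1 from then on, so filtering on it keeps every row from the first 1 onward.

```python
import pandas as pd

# Sample data from the question, with the default 0..11 index.
df = pd.DataFrame({
    "ID": [103200] * 3 + [104752] * 7 + [104760] * 2,
    "Flag": [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
})

# Running maximum of Flag within each ID:
# 0 before the first 1 of the group, 1 from the first 1 onward.
seen_one = df.groupby("ID")["Flag"].cummax()

# Show the mask next to the data, then keep the rows where it is non-zero.
print(df.assign(seen_one=seen_one))
out = df[seen_one.ne(0)]
print(out)
```

One consequence of this approach is that an ID which never contains a 1 is removed entirely, because its running maximum never leaves 0.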

CodePudding user response:

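Use the DataFrame index to find the rows where Flag == 1, then drop the row at each of those index values minus 1. Note that this removes only the single row directly before each 1, so the result differs from the desired output: the first 0 of ID 104752 (index 4) is kept, and the 0 between the two 1s (index 7) is dropped.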

import pandas as pd

data="""ID      Flag
103200  0
103200  1
103200  0
104752  0
104752  0
104752  1
104752  0
104752  1
104752  0
104752  0
104760  0
104760  1"""

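# Parse the text block into rows: the header is separated by six spaces,
# the data rows by two.
mylist=[]
data = data.split("\n")
for item in data:
    elements=item.split('      ')
    if len(elements) == 1:
        elements=item.split('  ')
    mylist.append(elements)

df = pd.DataFrame(mylist)
# Promote the first row to column names, then drop it from the data.
df.columns = df.iloc[0]
df.drop(index=df.index[0], inplace=True)
df["Flag"] = df["Flag"].astype(int)
# Drop the row directly before each row where Flag == 1.
df.drop(df[df["Flag"] == 1].index - 1, inplace=True)
print(df)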

output:

0       ID  Flag
2   103200     1
3   103200     0
4   104752     0
6   104752     1
8   104752     1
9   104752     0
10  104752     0
12  104760     1
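
A variation on this index-based idea that drops every row before the first 1 of each ID, rather than one row before each 1, might look like the following sketch. It assumes a default, increasing integer index and rebuilds the sample data directly; the names `pos`, `first_one`, and `out` are only illustrative.

```python
import pandas as pd

# Sample data from the question, with the default 0..11 index.
df = pd.DataFrame({
    "ID": [103200] * 3 + [104752] * 7 + [104760] * 2,
    "Flag": [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
})

# Index labels as a Series, so they can be masked and grouped.
pos = df.index.to_series()

# For each ID, the smallest index label at which Flag == 1
# (NaN for an ID that never contains a 1).
first_one = pos.where(df["Flag"].eq(1)).groupby(df["ID"]).transform("min")

# Keep rows at or after the first 1 of their ID; comparisons with NaN
# are False, so IDs without any 1 are dropped.
out = df[pos >= first_one]
print(out)
```

On the sample data this should keep the same rows as the desired result in the question, since each row is compared against the first 1 of its own ID only.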