I have a DataFrame with two columns, flag and flag1. I want to check whether flag is 1 for 5 or more consecutive rows; when it is, flag1 should change to 1 on the row where the run reaches 5.
Here is an example (flag1 holds the expected result):
import pandas as pd
df = pd.DataFrame({'flag': [0,0,1,1,1,1,1,1,1,0,0,0], 'flag1': [0,0,0,0,0,0,1,0,0,0,0,0]})
CodePudding user response:
The idea is to count consecutive 1s and then test whether the count equals 5:
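# True where flag is 1
a = df['flag'].eq(1)
#https://stackoverflow.com/a/52718619/2901002
# running count of 1s over the whole column
b = a.cumsum()
# subtract the count frozen at the last non-1 row to get the current run length, then test for 5
df['new'] = b.sub(b.mask(a).ffill().fillna(0)).eq(5).astype(int)
print(df)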
    flag  flag1  new
0      0      0    0
1      0      0    0
2      1      0    0
3      1      0    0
4      1      0    0
5      1      0    0
6      1      1    1
7      1      0    0
8      1      0    0
9      0      0    0
10     0      0    0
11     0      0    0
Detail:
print(b.sub(b.mask(a).ffill().fillna(0)))
0     0.0
1     0.0
2     1.0
3     2.0
4     3.0
5     4.0
6     5.0
7     6.0
8     7.0
9     0.0
10    0.0
11    0.0
Name: flag, dtype: float64
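The run-length series shown in the detail above can be wrapped in a small helper, which makes it easy to choose between flagging only the row where the run reaches 5 and flagging every row from the 5th onward. This is a minimal sketch: the function name run_length_of_ones and the column names reached_5 and at_least_5 are hypothetical, not part of the original answer.

```python
import pandas as pd

def run_length_of_ones(flag: pd.Series) -> pd.Series:
    """Length of the current run of 1s at each row (0 where flag is not 1)."""
    a = flag.eq(1)
    b = a.cumsum()
    # b.mask(a) keeps the running count only on rows where flag is not 1;
    # ffill carries that value through the following run of 1s,
    # and fillna(0) covers a run that starts on the very first row.
    baseline = b.mask(a).ffill().fillna(0)
    return b.sub(baseline).astype(int)

df = pd.DataFrame({'flag': [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]})
run = run_length_of_ones(df['flag'])

# Only the row where the run hits 5 (matches the expected flag1 in the question)
df['reached_5'] = run.eq(5).astype(int)
# Every row that is the 5th or later in a run, in case that reading is wanted
df['at_least_5'] = run.ge(5).astype(int)
print(df)
```

Using eq(5) reproduces the expected flag1 from the question; ge(5) is the variant to use if every row after the fifth consecutive 1 should also be marked.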
CodePudding user response:
Setup:
import pandas as pd
df = pd.DataFrame({'flag': [0,0,1,1,1,1,1,1,1,0,0,0], 'flag1': [0,0,0,0,0,0,1,0,0,0,0,0]})
Solution:
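# min_periods=1 avoids leading NaNs, so a run that starts on the first row is still detected
rolling_sum = df["flag"].rolling(5, min_periods=1).sum()
# 1 only on the row where the 5-row window first fills with 1s (sum is 5 and has just increased)
df["check"] = ((rolling_sum == 5) & (rolling_sum.diff() == 1)).astype(int)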
print(df)
    flag  flag1  check
0      0      0      0
1      0      0      0
2      1      0      0
3      1      0      0
4      1      0      0
5      1      0      0
6      1      1      1
7      1      0      0
8      1      0      0
9      0      0      0
10     0      0      0
11     0      0      0
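A quick way to gain confidence in either answer is to run both on a few edge cases and confirm they agree. The sketch below does that. The function names via_cumsum and via_rolling and the test inputs are hypothetical. One assumption to note: the rolling version here passes min_periods=1, because with the default setting the first four rolling sums are NaN, so the diff on the fifth row is NaN and a run starting on the very first row would not be flagged.

```python
import pandas as pd

def via_cumsum(flag: pd.Series, n: int = 5) -> pd.Series:
    # Run length of consecutive 1s, then mark the row where it equals n
    a = flag.eq(1)
    b = a.cumsum()
    return b.sub(b.mask(a).ffill().fillna(0)).eq(n).astype(int)

def via_rolling(flag: pd.Series, n: int = 5) -> pd.Series:
    # min_periods=1 is an assumption added here so the first rows are not NaN
    s = flag.rolling(n, min_periods=1).sum()
    # Window is full of 1s and the sum has just increased by 1
    return ((s == n) & (s.diff() == 1)).astype(int)

cases = {
    "question": [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    "run_at_start": [1, 1, 1, 1, 1, 1, 0, 0],
    "two_runs": [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1],
    "too_short": [1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
}

for name, values in cases.items():
    flag = pd.Series(values)
    # Both approaches should mark exactly the same rows
    assert via_cumsum(flag).equals(via_rolling(flag)), name
    print(name, via_cumsum(flag).tolist())
```

Both versions assume flag contains only 0 and 1; the rolling sum would count other values differently from the eq(1) test.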