How to count the number of days since a column flag?

I have a dataframe defined as follows. I'd like to count the number of days (or rows) since the input column last changed from 1 to 0:

import pandas as pd
df = pd.DataFrame({'input': [1,1,1,0,0,0,1,1,1,0,0,0]}, 
                  index=pd.date_range('2021-10-01', periods=12))
# I can mark the points of interest, i.e. when it goes from 1 to 0
df['change'] = 0
df.loc[(df['input'].shift(1) - df['input']) > 0, 'change'] = 1
print(df)

I end up with the following:

            input   change
2021-10-01      1        0
2021-10-02      1        0 
2021-10-03      1        0
2021-10-04      0        1
2021-10-05      0        0
2021-10-06      0        0
2021-10-07      1        0
2021-10-08      1        0
2021-10-09      1        0
2021-10-10      0        1
2021-10-11      0        0
2021-10-12      0        0

What I want is a res column like this:

            input   change     res
2021-10-01      1        0       0
2021-10-02      1        0       0  
2021-10-03      1        0       0
2021-10-04      0        1       1
2021-10-05      0        0       2
2021-10-06      0        0       3
2021-10-07      1        0       0
2021-10-08      1        0       0
2021-10-09      1        0       0
2021-10-10      0        1       1
2021-10-11      0        0       2
2021-10-12      0        0       3

I know I can use a cumsum, but I can't find a way to "reset" it at the appropriate points:

df['res'] = (1 - df['input']).cumsum()*(1 - df['input'])

But this keeps accumulating across the whole column and does not reset where change == 1.

CodePudding user response:

We can create a boolean Series that is True only where input equals 0, group it by runs of consecutive equal values, and take the groupby cumsum of that boolean Series. This is essentially numbering the rows within each run, but only for runs where input is 0 (the cumsum stays at 0 inside runs of 1s):

m = df['input'].eq(0)
df['res'] = m.groupby(m.ne(m.shift()).cumsum()).cumsum()
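
To see why this works, it may help to look at the intermediate pieces side by side. The following is only a sketch that rebuilds the question's frame; the names run_id and steps are made up here for illustration and are not part of the answer above:

```python
import pandas as pd

# Same frame as in the question
df = pd.DataFrame({'input': [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]},
                  index=pd.date_range('2021-10-01', periods=12))

m = df['input'].eq(0)              # True on rows where input is 0
# m.ne(m.shift()) is True on every row where m differs from the previous row,
# so its cumsum gives one label per run of consecutive equal values
run_id = m.ne(m.shift()).cumsum()  # hypothetical helper name

# Collect the pieces in a scratch frame (hypothetical name) for inspection
steps = pd.DataFrame({
    'm': m,
    'run_id': run_id,
    'res': m.groupby(run_id).cumsum(),  # cumsum restarts within each run
})
print(steps)
```

Because True counts as 1 and False as 0, the per-run cumsum counts 1, 2, 3 inside a run of 0s and stays at 0 inside a run of 1s.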

df:

            input  change  res
2021-10-01      1       0    0
2021-10-02      1       0    0
2021-10-03      1       0    0
2021-10-04      0       1    1
2021-10-05      0       0    2
2021-10-06      0       0    3
2021-10-07      1       0    0
2021-10-08      1       0    0
2021-10-09      1       0    0
2021-10-10      0       1    1
2021-10-11      0       0    2
2021-10-12      0       0    3
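
If you would rather stay close to the plain cumsum from the question, one possible way to "reset" it is to subtract the value the running total had at the most recent row where input was 1. This is a different technique from the groupby above, shown only as a sketch; total, baseline and res_alt are names invented for this example:

```python
import pandas as pd

# Same frame as in the question
df = pd.DataFrame({'input': [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]},
                  index=pd.date_range('2021-10-01', periods=12))

m = df['input'].eq(0)   # True on rows where input is 0
total = m.cumsum()      # running count of all 0 rows, never resets

# Keep the running count only on rows where input is 1, then carry it
# forward so each 0 row sees the count as it stood before its run began.
# fillna(0) covers a column that starts with 0s (nothing to carry forward yet).
baseline = total.where(~m).ffill().fillna(0)

# The difference is the position within the current run of 0s;
# where() introduced floats, so cast back to int
res_alt = (total - baseline).astype(int)
print(res_alt)
```

On the sample data this gives the same values as the res column above. Both approaches count rows, not calendar days, so they only match "days" when the index has exactly one row per day, as it does here.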