how to add conditional counter to pandas dataframe

Time:11-02

I have a dataframe as shown below:

id  date       notify
3   04/09/2019  no
3   30/10/2019  yes
3   03/05/2020  no
3   05/09/2020  no
3   31/10/2020  yes
3   03/11/2020  no
5   03/09/2019  no
5   27/10/2019  yes
5   02/05/2020  no

I want to assign a group number that increments each time "notify" is "yes", and then apply the same number to the row that follows it, where "notify" will always be "no".

It should look like this:

id  date       notify time_group
3   04/09/2019  no       
3   30/10/2019  yes       1
3   03/05/2020  no        1
3   05/09/2020  no
3   31/10/2020  yes       2
3   03/11/2020  no        2
5   03/09/2019  no
5   27/10/2019  yes       3
5   02/05/2020  no        3

At the moment I have tried this without much success:

i = 0
df['time_grp'] = np.nan
for row in df.iterrows():
    if row['notify'] == 'yes':
        row['time_group'] = i
        i += 1

I was wondering if there is a more pandas friendly way of accomplishing this? Perhaps making use of cumcount()? I know I can then probably use shift(-1) for applying the same counter group number to the next row...

Answer:

Try:

# mark the `yes` rows
s = df['notify'].eq('yes')


# s.cumsum() enumerates the `yes` blocks across the whole frame;
# use `s.groupby(df['id']).cumsum()` instead to restart the count within each id
df['time_group'] = s.cumsum().where(               # use `where` to keep
     s |                                           # the `yes` rows
     s.groupby(df['id']).shift(fill_value=False)   # and the row after each `yes`, within the same id
)
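
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The frame is rebuilt by hand from the sample in the question, the names `is_yes`, `after_yes`, and `time_group_per_id` are only illustrative, and the cast to the nullable `Int64` dtype is an optional extra that is not part of the answer above (it just keeps the counter as integers instead of floats).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [3, 3, 3, 3, 3, 3, 5, 5, 5],
    "date": ["04/09/2019", "30/10/2019", "03/05/2020", "05/09/2020",
             "31/10/2020", "03/11/2020", "03/09/2019", "27/10/2019",
             "02/05/2020"],
    "notify": ["no", "yes", "no", "no", "yes", "no", "no", "yes", "no"],
})

# True on every "yes" row.
is_yes = df["notify"].eq("yes")

# True on the row directly after a "yes", without crossing into the next id.
after_yes = is_yes.groupby(df["id"]).shift(fill_value=False).astype(bool)

# Rows to label: each "yes" row and the row that follows it.
mask = is_yes | after_yes

# Counter running across the whole frame (matches the desired output).
# Rows outside the mask become missing; Int64 keeps the rest as integers.
df["time_group"] = is_yes.cumsum().where(mask).astype("Int64")

# Variant: restart the counter within each id.
df["time_group_per_id"] = (
    is_yes.groupby(df["id"]).cumsum().where(mask).astype("Int64")
)

print(df)
```

Rows that are neither a "yes" nor the row right after one are left as missing values in both columns. Because the shift is done per `id`, a "yes" on the last row of one id does not leak its number into the first row of the next id.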