Home > Blockchain >  Replacing or sequencing in pandas dataframe column based on previous values and other column
Replacing or sequencing in pandas dataframe column based on previous values and other column

Time:06-16

I have a pandas df:

date          day_of_week   day
2021-01-01    3              1
2021-01-02    4              2 
2021-01-03    5              0
2021-01-04    6              1
2021-01-05    7              2
2021-01-06    1              3
2021-01-07    2              0
2021-01-08    3              0

I would like to change numeration for 'day' column based on the 'day_of_week' column values. For example, if the event starts before Thursday (<4) I want to use numbering for 'day' column values, which are greater than 0, from 20 (instead of 1) and forth. If the event starts on Thursday but before Monday (>=4) I want to use numbering for values, which are greater than 0, from 30 (instead of 1) and forth.

The table should look like this:

date          day_of_week   day
    2021-01-01    3              20
    2021-01-02    4              21 
    2021-01-03    5              0
    2021-01-04    6              30
    2021-01-05    7              31
    2021-01-06    1              32
    2021-01-07    2              0
    2021-01-08    3              0

I tried to use np.where to substitute values but I don't how to iterate through rows and insert values based on previous rows.

Please help!

CodePudding user response:

We can use cumsum create the group then select the 20 or 30 by transform first day of each group

s = df.groupby(df['day'].eq(1).cumsum())['day_of_week'].transform('first')
df['day'] = df.day.where(df.day==0, df.day   np.where(s<4,19,29))
df
Out[16]: 
         date  day_of_week  day
0  2021-01-01            3   20
1  2021-01-02            4   21
2  2021-01-03            5    0
3  2021-01-04            6   30
4  2021-01-05            7   31
5  2021-01-06            1   32
6  2021-01-07            2    0
7  2021-01-08            3    0
  • Related