I have a pandas df:
date day_of_week day
2021-01-01 3 1
2021-01-02 4 2
2021-01-03 5 0
2021-01-04 6 1
2021-01-05 7 2
2021-01-06 1 3
2021-01-07 2 0
2021-01-08 3 0
I would like to change numeration for 'day' column based on the 'day_of_week' column values. For example, if the event starts before Thursday (<4) I want to use numbering for 'day' column values, which are greater than 0, from 20 (instead of 1) and forth. If the event starts on Thursday but before Monday (>=4) I want to use numbering for values, which are greater than 0, from 30 (instead of 1) and forth.
The table should look like this:
date day_of_week day
2021-01-01 3 20
2021-01-02 4 21
2021-01-03 5 0
2021-01-04 6 30
2021-01-05 7 31
2021-01-06 1 32
2021-01-07 2 0
2021-01-08 3 0
I tried to use np.where
to substitute values but I don't how to iterate through rows and insert values based on previous rows.
Please help!
CodePudding user response:
We can use cumsum
create the group then select the 20 or 30 by transform
first
day of each group
s = df.groupby(df['day'].eq(1).cumsum())['day_of_week'].transform('first')
df['day'] = df.day.where(df.day==0, df.day np.where(s<4,19,29))
df
Out[16]:
date day_of_week day
0 2021-01-01 3 20
1 2021-01-02 4 21
2 2021-01-03 5 0
3 2021-01-04 6 30
4 2021-01-05 7 31
5 2021-01-06 1 32
6 2021-01-07 2 0
7 2021-01-08 3 0