How to generate dates between two dates convert it to week nbr, add it to an existing dataframe and-CodePudding

I was hoping if anyone could help me with this use-case:

I want to generate dates between two dates and then tag each date with week number, then add both new generated dates and week number as new columns to the original dataframe and map it to user id.

this is the existing dataframe:

user_id	start_dt	end_dt
1	2022-01-01	2022-02-01
2	2022-01-14	2022-03-14
3	2022-01-05	2022-02-05
4	2022-01-25	2022-02-25

generating dates between start and end date and tag date with wk number

user_id	date	week_nbr
1	2022-01-01	w1
1	2022-01-02	w1
1	2022-01-03	w1
1	2022-01-04	w1
1	2022-01-05	w1
1	2022-01-06	w1
1	2022-01-07	w1
1	2022-01-08	w2

Finally map the generated wk and dates back to the original table using user_id:

user_id	start_dt	end_dt	date	week_nbr
1	2022-01-01	2022-02-01	2022-01-01	w1
1	2022-01-01	2022-02-01	2022-01-02	w1
1	2022-01-01	2022-02-01	2022-01-03	w1
1	2022-01-01	2022-02-01	2022-01-04	w1
1	2022-01-01	2022-02-01	2022-01-05	w1
1	2022-01-01	2022-02-01	2022-01-06	w1
1	2022-01-01	2022-02-01	2022-01-07	w1
1	2022-01-01	2022-02-01	2022-01-08	w2

Any thoughts?

CodePudding user response：

I believe this should give you what you are looking for:

(df.assign(
    date = [pd.date_range(i,j) for i,j in zip(df['start_dt'],df['end_dt'])]).explode('date')
 .assign(week_nbr = lambda x: x.groupby('user_id')['date']
         .diff()
         .dt.days
         .cumsum()
         .floordiv(7)
         .add(1,fill_value=0)
         .astype(int)
         .map('w{}'.format))
 .reset_index(drop=True))

Output: (top 5 rows)

     user_id   start_dt     end_dt       date week_nbr
0          1 2022-01-01 2022-02-01 2022-01-01       w1
1          1 2022-01-01 2022-02-01 2022-01-02       w1
2          1 2022-01-01 2022-02-01 2022-01-03       w1
3          1 2022-01-01 2022-02-01 2022-01-04       w1
4          1 2022-01-01 2022-02-01 2022-01-05       w1