Pandas duplicate data between 2 dates-CodePudding

I'm trying to find a method of duplicating all of the data in a row for every month between dates. Start date and end date.

This is the dataset:

ID	Start	End
1007	2022-03-01	2022-08-01
1008	2019-11-01	2020-02-01

What I would like to do is repeat the row, incrementing the date, every month between the start and end values. Example outcome:

ID	Start	End
1007	2022-03-01	2022-08-01
1007	2022-04-01	2022-08-01
1007	2022-05-01	2022-08-01
1007	2022-06-01	2022-08-01
1007	2022-07-01	2022-08-01
1007	2022-07-01	2022-08-01
1008	2019-11-01	2020-02-01
1008	2019-12-01	2020-02-01
1008	2020-01-01	2020-02-01
1008	2020-02-01	2020-02-01

Thanks

CodePudding user response：

you can move in all row data and check data_start is preset start duplicated and when present the date_end can you exit the loop Thanks

CodePudding user response：

here is one way to do it

create range from start and end, then use explode. drop the temporary created column 'range' and remove rows where start and end date are same

df=df.assign(range=df.apply(lambda x: pd.date_range(x['Start'], x['End'], freq='MS').strftime('%Y-%m-%d').tolist(), axis=1)).explode('range').reset_index()
df['Start'] = df['range']
df.drop(df[df['Start'] == df['End']].index, inplace=True)
df.drop(columns=['index','range'], inplace=True)
df

    ID           Start        End
0   1007    2022-03-01  2022-08-01
1   1007    2022-04-01  2022-08-01
2   1007    2022-05-01  2022-08-01
3   1007    2022-06-01  2022-08-01
4   1007    2022-07-01  2022-08-01
6   1008    2019-11-01  2020-02-01
7   1008    2019-12-01  2020-02-01
8   1008    2020-01-01  2020-02-01