I would like to optimize my program which should check the date and write extra rows which will after all show 6 months with the input date. E.g. if I have date 2021-01-01 for some ID I will get rows with the same ID and date for 6 months in a row with the started one which is JAN 2021 (this is shown in example below).
I have a data frame with two columns: ID and Demand_date as it is in input:
ID Demand_date
001 2021-01-01
002 2016-03-01
003 2015-10-01
And the output should look like that:
ID Demand_date
001 2021-01-01
001 2021-02-01
001 2021-03-01
001 2021-04-01
001 2021-05-01
001 2021-06-01
002 2016-03-01
002 2016-04-01
002 2016-05-01
002 2016-06-01
002 2016-07-01
002 2016-08-01
003 2015-10-01
003 2015-11-01
003 2015-12-01
003 2016-01-01
003 2016-02-01
003 2016-03-01
I have too many loops in my program and for 1M of data it is taking too long time.
CodePudding user response:
You can map a date range, and explode it into a new dataframe
temp = DF["DATE"].map(lambda x: pd.date_range(x, periods=3, freq='D')).reset_index()
temp = temp.rename(columns={"index":"ID"})
temp = temp.explode("DATE")
temp
That results in a daily resample;
I imagine you can figure out the monthly resample.
Hope that helps!