How do I upsample a dataframe using resample() to get the initial values divided over the new sample frequency?
Dataframe with monthly sample frequency
date revenue
0 2021-11-01 00:00:00+00:00 300
1 2021-10-01 00:00:00+00:00 500
2 2021-09-01 00:00:00+00:00 100
3 2021-08-01 00:00:00+00:00 50
4 2021-07-01 00:00:00+00:00 200
5 2021-06-01 00:00:00+00:00 150
Approximate expected DataFrame, with revenue divided over the days in each month:
revenue
date
2021-06-01 00:00:00+00:00 4.8
2021-06-02 00:00:00+00:00 4.8
2021-06-03 00:00:00+00:00 4.8
2021-06-04 00:00:00+00:00 4.8
2021-06-05 00:00:00+00:00 4.8
... ...
2021-11-28 00:00:00+00:00 9.6
2021-11-29 00:00:00+00:00 9.6
2021-11-30 00:00:00+00:00 9.6
2021-11-31 00:00:00+00:00 9.6
I.e., I want to be sure that each value gets divided by the number of days in that specific month.
CodePudding user response:
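You can use asfreq() to convert the time series from monthly to daily frequency, use ffill() to forward fill the values, then divide revenue by the daysinmonth attribute of the DatetimeIndex to get the distributed revenue. Because asfreq() only spans the dates between the first and last index entries, a NaN row is first added at the end of the last month so that the final month is also filled through its last day.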
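import numpy as np
import pandas as pd

s = df.set_index('date')
# add a placeholder row at the end of the last month so it is covered too
s.loc[s.index.max() + pd.offsets.MonthEnd()] = np.nan
s = s.asfreq('D').ffill()
# each daily row now holds its month's total; divide by that month's length
s['revenue'] /= s.index.daysinmonth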
print(s)
revenue
date
2021-06-01 00:00:00+00:00 5.000000
2021-06-02 00:00:00+00:00 5.000000
2021-06-03 00:00:00+00:00 5.000000
2021-06-04 00:00:00+00:00 5.000000
2021-06-05 00:00:00+00:00 5.000000
...
2021-07-24 00:00:00+00:00 6.451613
2021-07-25 00:00:00+00:00 6.451613
...
2021-11-30 00:00:00+00:00 10.000000
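For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question (assuming the timestamps are UTC) and, as a variation on the snippet above, uses reindex() over an explicit date_range() instead of adding a NaN row before asfreq(); the resulting daily index should be the same. The names daily and monthly_totals are only illustrative. The last step resamples the daily values back to months as a sanity check that nothing was lost in the split.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (UTC timestamps are an assumption).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-11-01", "2021-10-01", "2021-09-01",
         "2021-08-01", "2021-07-01", "2021-06-01"],
        utc=True,
    ),
    "revenue": [300, 500, 100, 50, 200, 150],
})

s = df.set_index("date").sort_index()

# Daily index from the first month start through the end of the last month.
full_index = pd.date_range(
    s.index.min(), s.index.max() + pd.offsets.MonthEnd(), freq="D"
)

# Upsample: new days are NaN after reindex, then take the month's value via ffill.
daily = s.reindex(full_index).ffill()

# Spread each monthly total evenly over the days of its own month.
daily["revenue"] /= daily.index.daysinmonth

# Sanity check: summing the daily values per month should give back the input.
monthly_totals = daily["revenue"].resample("MS").sum()
print(daily)
print(monthly_totals)
```

If the daily sums in monthly_totals match the original revenue column (up to floating point rounding), the division used the correct number of days for each month.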