resample dataframe and divide values over new sample frequency

Time:05-27

How do I upsample a dataframe using resample() to get the initial values divided over the new sample frequency?

Dataframe with monthly sample frequency

                        date  revenue
0  2021-11-01 00:00:00+00:00      300
1  2021-10-01 00:00:00+00:00      500
2  2021-09-01 00:00:00+00:00      100
3  2021-08-01 00:00:00+00:00       50
4  2021-07-01 00:00:00+00:00      200
5  2021-06-01 00:00:00+00:00      150

Approximate expected Dataframe with revenue divided over the days in that month

                           revenue
date
2021-06-01 00:00:00+00:00      4.8
2021-06-02 00:00:00+00:00      4.8
2021-06-03 00:00:00+00:00      4.8
2021-06-04 00:00:00+00:00      4.8
2021-06-05 00:00:00+00:00      4.8
...                            ...
2021-11-28 00:00:00+00:00      9.6
2021-11-29 00:00:00+00:00      9.6
2021-11-30 00:00:00+00:00      9.6

I.e., I want to be sure that the values get divided over the number of days in that specific month.

CodePudding user response:

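You can use asfreq to convert the time series from monthly to daily frequency, use ffill to forward fill the values, and then divide the revenue by the daysinmonth attribute of the DatetimeIndex to calculate the distributed revenue. Before upsampling, add a NaN row at the end of the last month; otherwise the daily index would stop at the first day of that month.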

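import numpy as np
import pandas as pd

# Index by date, then add a NaN row at the end of the last month
# so the daily index extends through that month's final day.
s = df.set_index('date')
s.loc[s.index.max() + pd.offsets.MonthEnd()] = np.nan

# Upsample to daily frequency and carry each month's value forward.
s = s.asfreq('D').ffill()
# Divide each day's value by the number of days in its month.
s['revenue'] /= s.index.daysinmonth

print(s)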
                             revenue
date
2021-06-01 00:00:00+00:00   5.000000
2021-06-02 00:00:00+00:00   5.000000
2021-06-03 00:00:00+00:00   5.000000
2021-06-04 00:00:00+00:00   5.000000
2021-06-05 00:00:00+00:00   5.000000
...
2021-07-24 00:00:00+00:00   6.451613
2021-07-25 00:00:00+00:00   6.451613
...
2021-11-30 00:00:00+00:00  10.000000
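
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. It rebuilds the sample data from the question (assuming the dates are UTC timestamps) and wraps the steps in a helper, spread_monthly_to_daily, which is a hypothetical name used only for illustration. It also sorts the index and casts revenue to float up front, which the answer above does not do; these are assumptions made here to keep the example tidy, not requirements stated in the answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to be UTC timestamps).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-11-01", "2021-10-01", "2021-09-01",
         "2021-08-01", "2021-07-01", "2021-06-01"],
        utc=True,
    ),
    "revenue": [300, 500, 100, 50, 200, 150],
})


def spread_monthly_to_daily(frame):
    """Hypothetical helper: spread each month's revenue evenly over its days."""
    s = frame.set_index("date").sort_index().astype({"revenue": "float64"})
    # Placeholder row at the end of the last month so the daily index covers it.
    s.loc[s.index.max() + pd.offsets.MonthEnd()] = np.nan
    # Upsample to daily rows and carry each month's total forward.
    s = s.asfreq("D").ffill()
    # Divide every daily row by the number of days in its own month.
    s["revenue"] = s["revenue"] / s.index.days_in_month
    return s


daily = spread_monthly_to_daily(df)

# Sanity check: summing the daily values per month should give back
# the original monthly totals.
monthly_totals = daily["revenue"].groupby(daily.index.strftime("%Y-%m")).sum()
print(monthly_totals.round(6))
```

The per-month sums at the end are a quick way to confirm that nothing was lost or double counted when the values were spread over the days.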