Suppose I have a multi-index Pandas data frame with two index levels: month_begin and month_end
import pandas as pd
multi_index = pd.MultiIndex.from_tuples([("2022-03-01", "2022-03-31"),
("2022-04-01", "2022-04-30"),
("2022-05-01", "2022-05-31"),
("2022-06-01", "2022-06-30")])
multi_index.names = ['month_begin', 'month_end']
df = pd.DataFrame(np.random.rand(4,100), index=multi_index)
df
0 1 ... 98 99
month_begin month_end ...
2022-03-01 2022-03-31 0.322032 0.205307 ... 0.975128 0.673460
2022-04-01 2022-04-30 0.113813 0.278981 ... 0.951049 0.090765
2022-05-01 2022-05-31 0.777918 0.842734 ... 0.667831 0.274189
2022-06-01 2022-06-30 0.221407 0.555711 ... 0.745158 0.648246
What I want to do is to resample the data on a daily basis such that the monthly values show up for every day in the month:
0 1 ... 98 99
...
2022-03-01 0.322032 0.205307 ... 0.975128 0.673460
2022-03-02 0.322032 0.205307 ... 0.975128 0.673460
2022-03-03 0.322032 0.205307 ... 0.975128 0.673460
...
2022-06-29 0.221407 0.555711 ... 0.745158 0.648246
2022-06-30 0.221407 0.555711 ... 0.745158 0.648246
I tried to retrieve the month from the first index
df.index.get_level_values('month_begin'),
but then I don't know how to tell that it should resample it on a daily basis. Does anybody know how to do this? I would very much appreciate your help.
CodePudding user response:
Use:
#convert MultiIndex to columns
df1 = df.reset_index()
#if necessary convert to datetimes
df1['month_begin'] = pd.to_datetime(df1['month_begin'])
df1['month_end'] = pd.to_datetime(df1['month_end'])
#repeat indices by difference between end and start in days
df1 = (df1.loc[df1.index.repeat(df1['month_end'].sub(df1['month_begin']).dt.days 1)]
.drop('month_end',1))
#creat ecounter converted to days timedeltas
s = pd.to_timedelta(df1.groupby(level=0).cumcount(), unit='d')
#add timedeltas to month_begin and convert to DatetimeIndex
df1 = (df1.assign(month_begin = df1['month_begin'].add(s))
.set_index('month_begin')
.rename_axis(None))
Output for first 3 columns:
print (df1.iloc[:, :3])
0 1 2
2022-03-01 0.009359 0.499058 0.113384
2022-03-02 0.009359 0.499058 0.113384
2022-03-03 0.009359 0.499058 0.113384
2022-03-04 0.009359 0.499058 0.113384
2022-03-05 0.009359 0.499058 0.113384
... ... ...
2022-06-26 0.460350 0.241094 0.821568
2022-06-27 0.460350 0.241094 0.821568
2022-06-28 0.460350 0.241094 0.821568
2022-06-29 0.460350 0.241094 0.821568
2022-06-30 0.460350 0.241094 0.821568
[122 rows x 3 columns]