Home > Mobile >  Resampling on a multi index
Resampling on a multi index

Time:01-14

I have a DataFrame of the following form: enter image description here

You see that it has a multi index. For each muni index I want to do a resampling of the form .resample('A').mean() of the popDate index. Hence, I want python to fill in the missing years. NaN values shall be replaced by a linear interpolation. How do I do that?

Update: Some mock input DataFrame:

interData=pd.DataFrame({'muni':['Q1','Q1','Q1','Q2','Q2','Q2'],'popDate':['2015','2021','2022','2015','2017','2022'],'population':[5,11,22,15,17,22]})
interData['popDate']=pd.to_datetime(interData['popDate'])
interData=interData.set_index(['muni','popDate'])

CodePudding user response:

It looks like you want a groupby.resample:

interData.groupby(level='muni').resample('A', level='popDate').mean()

Output:

                 population
muni popDate               
Q1   2015-12-31         5.0
     2016-12-31         NaN
     2017-12-31         NaN
     2018-12-31         NaN
     2019-12-31         NaN
     2020-12-31         NaN
     2021-12-31        11.0
     2022-12-31        22.0
Q2   2015-12-31        15.0
     2016-12-31         NaN
     2017-12-31        17.0
     2018-12-31         NaN
     2019-12-31         NaN
     2020-12-31         NaN
     2021-12-31         NaN
     2022-12-31        22.0

If you also need interpolation, combine with interpolate:

out = (interData.groupby(level='muni')
       .apply(lambda g: g.resample('A', level='popDate').mean()
                         .interpolate(method='time'))
      )

Output:

                 population
muni popDate               
Q1   2015-12-31    5.000000
     2016-12-31    6.001825
     2017-12-31    7.000912
     2018-12-31    8.000000
     2019-12-31    8.999088
     2020-12-31   10.000912
     2021-12-31   11.000000
     2022-12-31   22.000000
Q2   2015-12-31   15.000000 # 366 days between 2015-12-31 and 2016-12-31
     2016-12-31   16.001368 # 365 days between 2016-12-31 and 2017-12-31
     2017-12-31   17.000000
     2018-12-31   17.999452
     2019-12-31   18.998905
     2020-12-31   20.001095
     2021-12-31   21.000548
     2022-12-31   22.000000
  • Related