Infill NAs with another dataframe

Time:11-15

I have a time series dataframe as follows:


            Volume
1992-04-01  357.073
1992-04-02  341.931
1992-04-03  318.777
1992-04-04  312.494
1992-04-05  270.837
.
.
.
2002-12-31  283.78


Some of the data has gaps, and I would like to fill these gaps with the '10-year normal', i.e. the mean for that calendar day across all years.

I can generate the normal with the following:

df_norm = df.groupby(by=[df.index.month, df.index.day]).mean()

which returns:

        Volume
1  1  337.1108
   2  362.6250
   3  354.4670
   4  364.3080
   5  374.0428

I am then trying to fillna() df with df_norm, but I am struggling to get it right.

This isn't working because the indexes are different:

df  = df.asfreq('d')
df  = df.set_index(df.index.day).fillna(df_fut).set_index(df.index)

Is there a way around this?

Any help would be much appreciated!

CodePudding user response:

This should work:

df['Volume'] = df['Volume'].fillna(df.groupby(by=[df.index.month, df.index.day])['Volume'].transform('mean'))
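
Grouping by the list [month, day] matters here: transform('mean') returns a Series on the original date index, so fillna() lines up row by row and no re-indexing is needed. Here is a minimal sketch of the same idea on made-up data, assuming df has a DatetimeIndex and a single Volume column (the sample values and gap dates are purely illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three years of daily values 0, 1, 2, ...
idx = pd.date_range("2000-01-01", "2002-12-31", freq="D")
df = pd.DataFrame({"Volume": np.arange(len(idx), dtype=float)}, index=idx)

# Knock out two days to simulate gaps.
gaps = pd.to_datetime(["2001-01-01", "2002-03-15"])
df.loc[gaps, "Volume"] = np.nan

# Mean per (month, day) pair, broadcast back onto every row of df.
# NaN values are skipped when the mean is computed.
normal = df.groupby([df.index.month, df.index.day])["Volume"].transform("mean")

# Only the NaN rows take the normal; existing values are left alone.
filled = df["Volume"].fillna(normal)

print(filled.loc[gaps])
```

Note that if a calendar day is missing in every year, its mean is also NaN, so that gap will stay unfilled.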