I have a timeseries dataframe as follows;
Volume
1992-04-01 357.073
1992-04-02 341.931
1992-04-03 318.777
1992-04-04 312.494
1992-04-05 270.837
.
.
.
2002-12-31 283.78
Some of the data has gaps and I would like to fill these gaps with the '10 year normal';
I can generate the normal by the following;
df_norm = df.groupby(by=[df.index.month, df.index.day]).mean()
which returns;
Volume
1 337.1108
2 362.6250
3 1 354.4670
4 364.3080
5 374.0428
I am then trying to fillna() df with df_norm, but I'm struggling to get it right.
The following isn't working because the two indexes are different:
df = df.asfreq('d')
df = df.set_index(df.index.day).fillna(df_fut).set_index(df.index)
Is there a way around this?
Any help would be much appreciated!
CodePudding user response:
This should work:
df['Volume'] = df['Volume'].fillna(df.groupby(by=[df.index.month, df.index.day])['Volume'].transform('mean'))
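As a rough, self-contained sketch of the idea: transform('mean') returns a Series with the same DatetimeIndex as df, so fillna() can align on it directly and no re-indexing is needed. The sample data below is made up for illustration (three years of daily values with two artificial gaps); only the groupby/transform/fillna pattern comes from the posts above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three years of daily values (not the asker's data).
idx = pd.date_range("1992-01-01", "1994-12-31", freq="D")
df = pd.DataFrame({"Volume": np.arange(len(idx), dtype=float)}, index=idx)

# Punch two artificial gaps into the series.
df.loc[pd.Timestamp("1993-04-02"), "Volume"] = np.nan
df.loc[pd.Timestamp("1994-07-15"), "Volume"] = np.nan

# Group by (month, day) so each calendar day gets its own group.
# transform('mean') broadcasts each group's mean back onto the original
# DatetimeIndex; NaN values are skipped when the mean is computed.
normal = df.groupby([df.index.month, df.index.day])["Volume"].transform("mean")

# Both Series share the same index, so fillna aligns row by row.
df["Volume"] = df["Volume"].fillna(normal)

print(df.loc["1993-04-01":"1993-04-03"])
```

Note that a calendar day whose values are missing in every year (29 February is the likely candidate) would still be NaN after this, since its group has no values to average.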