How can I replace each NaN in my dataset with the average of the 5 values before it?
Column A | Column B |
---|---|
1 | 2 |
2 | 5 |
3 | 5 |
4 | 2 |
5 | 2 |
NaN | 2 |
NaN | 2 |
1 | 2 |
1 | 2 |
1 | 2 |
1 | NaN |
1 | 2 |
1 | 2 |
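For example, in Column A the first NaN should become the average of (1, 2, 3, 4, 5), and the second NaN should become the average of (2, 3, 4, 5, and the value filled in for the first NaN).
I have tried the following, but it fills every NaN with the mean of the whole column rather than the mean of the previous 5 values: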
df.fillna(df.mean())
CodePudding user response:
As mentioned, this has been answered here, but a version updated for recent pandas releases is as follows:
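import numpy as np
import pandas as pd

data = {'col1': [1, 2, 3, 4, 5, np.nan, np.nan, 1, 1, 1, 1, 1, 1],
        'col2': [2, 5, 5, 2, 2, 2, 2, 2, 2, 2, np.nan, 2, 2]}
df = pd.DataFrame(data)
window_size = 5
# The window covers the current row plus the 5 rows before it; mean() skips NaN.
df = df.fillna(df.rolling(window_size + 1, min_periods=1).mean())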
Output:
    col1  col2
0    1.0   2.0
1    2.0   5.0
2    3.0   5.0
3    4.0   2.0
4    5.0   2.0
5    3.0   2.0
6    3.5   2.0
7    1.0   2.0
8    1.0   2.0
9    1.0   2.0
10   1.0   2.0
11   1.0   2.0
12   1.0   2.0
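Note that the rolling mean is computed on the original values only, so row 6 of col1 becomes 3.5 (the mean of 2, 3, 4, 5) rather than the 3.4 the question asks for (the mean of 2, 3, 4, 5 and the 3.0 filled into row 5). If earlier fills need to feed into later ones, one option is a plain loop over each column. The following is only a minimal sketch under that assumption; `fill_with_trailing_mean` and its `window` parameter are made-up names for illustration, not part of pandas:

```python
import numpy as np
import pandas as pd


def fill_with_trailing_mean(series: pd.Series, window: int = 5) -> pd.Series:
    """Fill each NaN with the mean of the `window` values before it,
    counting values that were themselves filled earlier in the same pass.
    (Hypothetical helper, not a pandas function.)"""
    values = series.to_numpy(dtype=float, copy=True)
    for i in range(len(values)):
        if np.isnan(values[i]):
            previous = values[max(0, i - window):i]
            previous = previous[~np.isnan(previous)]
            if previous.size:  # a NaN with no usable history stays NaN
                values[i] = previous.mean()
    return pd.Series(values, index=series.index, name=series.name)


df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5, np.nan, np.nan, 1, 1, 1, 1, 1, 1],
    'col2': [2, 5, 5, 2, 2, 2, 2, 2, 2, 2, np.nan, 2, 2],
})

# apply() passes each column to the helper as a Series.
filled = df.apply(fill_with_trailing_mean)
print(filled)
```

With this data, row 5 of col1 is filled with 3.0 and row 6 with 3.4. The loop is slower than the vectorized `rolling` call, so it is probably only worth using when the sequential behaviour actually matters.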