I would like to fill the NaNs in a pandas Series with the neighbouring non-NaN values, but only when the non-NaN values on either side of a run of NaNs are the same. Is there a clever, fast solution? I know I could write a function and apply it row by row with iterrows, but I am working with millions of rows and need something faster.
Example of Input:
0 NaN
1 1
2 NaN
3 1
4 NaN
5 2
6 NaN
7 NaN
8 2
9 NaN
Output:
0 NaN
1 1
2 1
3 1
4 NaN
5 2
6 2
7 2
8 2
9 NaN
CodePudding user response:
This might be a bit cheeky, but my first idea is to check where ffill and bfill would fill the same value.
>>> s
0 NaN
1 1.0
2 NaN
3 1.0
4 NaN
5 2.0
6 NaN
7 NaN
8 2.0
9 NaN
dtype: float64
>>> ffill = s.ffill()
>>> s[ffill.eq(s.bfill())] = ffill
>>> s
0 NaN
1 1.0
2 1.0
3 1.0
4 NaN
5 2.0
6 2.0
7 2.0
8 2.0
9 NaN
dtype: float64
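If you would rather not modify s in place, the same idea can be wrapped in a small function that returns a new Series. This is only a sketch: the name fill_between_equal is made up for illustration, and it relies on the fact that comparing with NaN via eq gives False, so leading and trailing NaNs (where one of the two fills has nothing to offer) stay untouched.

```python
import numpy as np
import pandas as pd


def fill_between_equal(s: pd.Series) -> pd.Series:
    """Fill NaN runs only when the values on both sides are equal.

    Hypothetical helper name; returns a new Series and leaves `s` unchanged.
    """
    forward = s.ffill()   # last valid value seen so far
    backward = s.bfill()  # next valid value to come
    # Keep the forward fill where both fills agree; everything else becomes NaN.
    # Original non-NaN values always agree with themselves, so they are kept.
    return forward.where(forward.eq(backward))


s = pd.Series([np.nan, 1, np.nan, 1, np.nan, 2, np.nan, np.nan, 2, np.nan])
result = fill_between_equal(s)
print(result)
```

Both ffill and bfill are vectorised, so this should scale to millions of rows far better than a row-by-row loop, though it is worth timing it on your own data.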