Fill NA only between 2 same non-NA values

Time:02-21

I would like to fill the NAs in a pandas Series with non-NA values, but only if the non-NA values on either side of the NA run are the same. Is there a clever, fast solution? I know I could write a function and apply it row by row with iterrows, but I am working with millions of rows and need something faster.

Example of Input:

0    NaN
1    1
2    NaN
3    1
4    NaN
5    2
6    NaN
7    NaN
8    2
9    NaN

Output:

0    NaN
1    1
2    1
3    1
4    NaN
5    2
6    2
7    2
8    2
9    NaN

CodePudding user response:

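This might be a bit cheeky, but my first idea is to check where ffill and bfill would fill the same value, and fill only those positions. Leading and trailing NaNs are left alone, because one of the two fills is still NaN there and NaN never compares equal.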

>>> s
0    NaN
1    1.0
2    NaN
3    1.0
4    NaN
5    2.0
6    NaN
7    NaN
8    2.0
9    NaN
dtype: float64
>>> ffill = s.ffill()
>>> s[ffill.eq(s.bfill())] = ffill
>>> s
0    NaN
1    1.0
2    1.0
3    1.0
4    NaN
5    2.0
6    2.0
7    2.0
8    2.0
9    NaN
dtype: float64