Pandas filling nan in between two numbers

Time:10-19

In a very large dataset I want to fill in the NaN values that lie between two specific values. The dataset looks like this:

import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
                   'col2': [100, np.nan, 100, np.nan, np.nan, 999, 999, np.nan, 100, np.nan, np.nan, 100]})

I only want to fill the NaN values that lie between a 999 and a 100 (in either order), using 500 as the fill value. NaN values between two 100s should stay as they are. The expected output is:

df['col2'] = [100,np.nan,100,500,500,999,999,500,100,np.nan,np.nan,100]

CodePudding user response:

One option is to forward-fill and back-fill the column, then test whether the nearest non-missing values on either side of each row are 999 and 100:

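s1 = df['col2'].ffill()  # last non-missing value at or before each row
s2 = df['col2'].bfill()  # first non-missing value at or after each row

# Set 500 where one neighbour is 999 and the other is 100 (either order)
df.loc[(s1.eq(999) & s2.eq(100)) | (s1.eq(100) & s2.eq(999)), 'col2'] = 500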
print(df)
    col1   col2
0      1  100.0
1      2    NaN
2      3  100.0
3      4  500.0
4      5  500.0
5      6  999.0
6      7  999.0
7      8  500.0
8      9  100.0
9     10    NaN
10    11    NaN
11    12  100.0
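
If you need this for other value pairs, the same idea can be wrapped in a small function. This is only a sketch: `fill_between` and its parameter names are made up for illustration and are not part of pandas. It also adds an explicit `isna()` check. In the code above, rows that already hold a value have identical forward-filled and back-filled values, so they can never match the mask while the two bounds differ; the extra check keeps the function safe if both bounds happen to be the same number.

```python
import numpy as np
import pandas as pd


def fill_between(s, a, b, fill_value):
    """Fill NaN runs whose nearest non-missing neighbours are a and b (either order).

    Hypothetical helper for illustration, not a pandas function.
    """
    prev_valid = s.ffill()  # last non-missing value at or before each row
    next_valid = s.bfill()  # first non-missing value at or after each row
    between = (prev_valid.eq(a) & next_valid.eq(b)) | (prev_valid.eq(b) & next_valid.eq(a))
    # Restrict to rows that are actually missing, then replace them.
    # Series.mask returns a new Series and leaves the input unchanged.
    return s.mask(between & s.isna(), fill_value)


col2 = pd.Series([100, np.nan, 100, np.nan, np.nan, 999, 999, np.nan, 100, np.nan, np.nan, 100])
filled = fill_between(col2, 999, 100, 500)
print(filled.tolist())
```

With the DataFrame from the question you would call it as `df['col2'] = fill_between(df['col2'], 999, 100, 500)`. Leading or trailing NaN values have no neighbour on one side, so they stay NaN after the forward or back fill and are never matched.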