I have two data frames, main and auxiliary, and I am concatenating the auxiliary one onto the main one. This leaves NaN in some rows of the auxiliary column. I want to fill only some of those NaNs, not all of them. Code:
import pandas as pd
df1 = pd.DataFrame({'Main':[0,10,20,30,40,50,60,70,80]})
df1 =
Main
0 0
1 10
2 20
3 30
4 40
5 50
6 60
7 70
8 80
df2 = pd.DataFrame({'aux':['aa','aa','bb','bb']},index=[0,2,5,7])
df2 =
aux
0 aa
2 aa
5 bb
7 bb
df = pd.concat([df1,df2],axis=1)
# After concatenating, I want to fill the NaN rows in the aux column only when they
# lie between two rows with the same value. Example: fill the rows between 0 and 2 with 'aa',
# leave the rows between 2 and 5 as NaN, and fill the rows between 5 and 7 with 'bb'.
df = pd.concat([df1,df2],axis=1).ffill()  # same result as fillna(method='ffill'), which is deprecated in recent pandas
print(df)
Present result:
Main aux
0 0 aa
1 10 aa
2 20 aa
3 30 aa # Wrong, here it should be NaN
4 40 aa # Wrong, here it should be NaN
5 50 bb
6 60 bb
7 70 bb
8 80 bb # Wrong, here it should be NaN
Expected result:
Main aux
0 0 aa
1 10 aa
2 20 aa
3 30 NaN
4 40 NaN
5 50 bb
6 60 bb
7 70 bb
8 80 NaN
CodePudding user response:
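If I understand correctly, you want to fill a NaN only where a forward fill and a backward fill give the same value, meaning the nearest non-NaN values above and below it match. Starting from the concatenated df (before any fillna), that can be done like this: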
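ff = df['aux'].ffill()  # each NaN takes the last value seen above it
bf = df['aux'].bfill()  # each NaN takes the next value seen below it
# keep the filled value only where both directions agree; everything else becomes NaN
df['aux'] = ff.where(ff == bf)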
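For reference, here is the same idea as a self-contained script you can run as-is. It is only a sketch: the helper name `fill_between_equal` is made up for illustration and is not a pandas function, and the data is the example from the question.

```python
import pandas as pd


def fill_between_equal(s: pd.Series) -> pd.Series:
    """Fill NaNs only where the nearest values above and below are equal.

    Hypothetical helper for illustration; not part of pandas.
    """
    ff = s.ffill()  # propagate the last seen value downwards
    bf = s.bfill()  # propagate the next seen value upwards
    # NaN never compares equal, so leading/trailing gaps stay NaN
    return ff.where(ff == bf)


# Data from the question
df1 = pd.DataFrame({'Main': [0, 10, 20, 30, 40, 50, 60, 70, 80]})
df2 = pd.DataFrame({'aux': ['aa', 'aa', 'bb', 'bb']}, index=[0, 2, 5, 7])

df = pd.concat([df1, df2], axis=1)
df['aux'] = fill_between_equal(df['aux'])
print(df)
```

One thing to be aware of: this compares values only, so if the same value shows up again further down (say another 'aa' after a long gap with no other value in between), the whole gap between the two gets filled as well. If that can happen in your data you would need some extra grouping logic on top of this.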