I have some null values in one column of my dataframe. I used a linear regression to predict the missing values, and now I want to replace each NaN with its predicted value. I would like to use the index as the condition for fillna because I don't want any of the other values to be replaced by predictions.
Here are the rows with nulls in my dataframe, df:
is a b c d e f
72 True 171.94 103.89 103.45 NaN 3.25 112.79
99 True 171.93 104.07 104.18 NaN 3.14 113.08
151 True 172.07 103.80 104.38 NaN 3.02 112.93
197 True 171.45 103.66 103.80 NaN 3.62 113.27
241 True 171.83 104.14 104.06 NaN 3.02 112.36
Here is the Series of predicted values, indexed to match the rows to fill, prev:
72 4.318525
99 4.393668
151 4.410457
197 4.319014
241 4.650617
I don't know the best way to fill the missing values, and I want to be sure that each filled value goes to the row with the same index. Should I use a for loop?
CodePudding user response:
You can use fillna, which aligns the replacement values on the index, so no loop is needed:
df = df.fillna({'d': prev})
# OR
df['d'] = df['d'].fillna(prev)
Output:
>>> df
is a b c d e f
72 True 171.94 103.89 103.45 4.318525 3.25 112.79
99 True 171.93 104.07 104.18 4.393668 3.14 113.08
151 True 172.07 103.80 104.38 4.410457 3.02 112.93
197 True 171.45 103.66 103.80 4.319014 3.62 113.27
241 True 171.83 104.14 104.06 4.650617 3.02 112.36
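To check that fillna only touches the NaN positions, here is a minimal sketch on a cut-down stand-in for the data. The three-row frame and the non-null value 4.10 in row 99 are made up for illustration; it also assumes prev holds a prediction for every row, not just the missing ones.

```python
import numpy as np
import pandas as pd

# Cut-down stand-in for the question's df: rows 72 and 151 are missing 'd',
# row 99 already has a (made-up) value that must not be overwritten.
df = pd.DataFrame(
    {"a": [171.94, 171.93, 172.07], "d": [np.nan, 4.10, np.nan]},
    index=[72, 99, 151],
)

# Predictions indexed like df, including one for row 99 that should be ignored.
prev = pd.Series([4.318525, 4.393668, 4.410457], index=[72, 99, 151])

# fillna matches prev to df['d'] by index label and fills only the NaN entries.
filled = df["d"].fillna(prev)
print(filled)
```

Row 99 keeps its existing value even though prev contains a prediction for it, which is the behaviour the question asks for.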
CodePudding user response:
If the indexes are the same, you can also merge on the index (just make sure the column in your prediction dataframe, here df_data, has the same name as the column you are merging into, i.e. 'd'):
df_merge = pd.merge(df, df_data, left_index = True, right_index = True, suffixes=('_x', '')).drop('d_x', axis = 1)
df_merge
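One thing to watch with the merge approach, sketched below on made-up data: pd.merge defaults to an inner join, so any row of df whose index is absent from df_data is dropped, and the whole 'd' column is taken from df_data rather than only the NaN entries. Here df_data is a hypothetical one-column frame holding predictions for the missing rows only.

```python
import numpy as np
import pandas as pd

# Stand-in data: row 99 already has a (made-up) value for 'd'.
df = pd.DataFrame(
    {"a": [171.94, 171.93, 172.07], "d": [np.nan, 4.10, np.nan]},
    index=[72, 99, 151],
)

# Hypothetical df_data: predictions only for the rows that were missing.
df_data = pd.DataFrame({"d": [4.318525, 4.410457]}, index=[72, 151])

# Same call as in the answer: the old 'd' becomes 'd_x' and is dropped.
df_merge = pd.merge(
    df, df_data, left_index=True, right_index=True, suffixes=("_x", "")
).drop("d_x", axis=1)
print(df_merge)
```

With these inputs row 99 does not survive the merge, so this approach fits best when df contains only the rows with missing values, as in the question's excerpt.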