Add model predictions as a column in pandas but keep NaN as the prediction if null values are present

I have a pandas DataFrame that contains some null values, and I want to add a new column, model_prediction, holding the model's predictions on the data. The model does not accept null values, so I want model_prediction to be NaN for any row that contains them. The DataFrame is very large and df.iterrows is very slow, so I want to avoid iterating row by row.

CodePudding user response：

Assuming your dataframe is df and model is model, please try this:

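import numpy as np
import pandas as pd

df = df.reset_index(drop=True)
# Rows with at least one null value get NaN as their prediction.
na_mask = df.isna().any(axis=1)
df_na = df[na_mask].copy()  # copy to avoid SettingWithCopyWarning
df_na['model_prediction'] = np.nan
# Rows without nulls are passed to the model in a single batch.
df_model = df[~na_mask].copy()
df_model['model_prediction'] = model.predict(df_model.values)
# DataFrame.append was removed in pandas 2.0, so use pd.concat to recombine.
df = pd.concat([df_model, df_na]).sort_index()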
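
The same result can probably be reached without splitting and recombining the frame, by assigning predictions through a boolean mask. The following is a minimal sketch, not a definitive implementation. MeanModel and add_predictions are hypothetical names introduced only for illustration, and the sketch assumes the real model exposes a predict method that accepts a 2-D array and returns one value per row.

```python
import numpy as np
import pandas as pd


class MeanModel:
    """Hypothetical stand-in for the real model: predicts each row's mean."""

    def predict(self, X):
        return np.asarray(X, dtype=float).mean(axis=1)


def add_predictions(df, model, column="model_prediction"):
    """Return a copy of df with predictions for complete rows and NaN elsewhere."""
    out = df.copy()
    complete = out.notna().all(axis=1)  # True where a row has no nulls
    out[column] = np.nan                # default for rows the model cannot score
    if complete.any():                  # skip predict() when no row is complete
        out.loc[complete, column] = model.predict(df.loc[complete].to_numpy())
    return out


# Small example frame (made-up data): the second row has a null value.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})
result = add_predictions(df, MeanModel())
```

Because the mask selects rows in place, the original index and row order are preserved, so no reset_index or sort_index step is needed. The model is still called once on all complete rows rather than once per row.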