I have a dataframe which looks like this:
How do I drop the columns among Q1 - Q8 that have 3 or more missing values? After that, for the remaining Q1 - Q8 columns (those with 2 or fewer missing values), I want to fill the missing values with a default value of "0".
I have tried various forms of dropna(thresh=N), but I am not sure whether it can be restricted to specific columns only.
CodePudding user response:
One thing you could do is split your DataFrame into two DataFrames:
df_no_change = df[['A', 'B', 'C', 'D']]
df_change = df[['Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6', 'Q7', 'Q8']]
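Then apply the column removal to df_change. Here thresh is the minimum number of non-missing values a column needs in order to be kept, so any column with 3 or more missing values is dropped: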
df_change = df_change.dropna(thresh=len(df_change) - 2, axis=1)
Finally, concatenate the two DataFrames again (with this ordering, the Q columns come first):
df = pd.concat([df_change, df_no_change], axis=1)
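This may work, though it does not yet fill the remaining missing values in Q1 - Q8 with 0; calling fillna(0) on df_change before concatenating would cover that.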
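For reference, here is a minimal sketch of the whole task (drop, then fill) done without splitting the DataFrame. The sample data is made up, since the original DataFrame is not shown, and it assumes the question columns can be picked out by a name starting with "Q" and that a numeric 0 (rather than the string "0") is wanted as the default:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real DataFrame from the question is not shown.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "Q1": [1.0, np.nan, 3.0, 4.0, 5.0],        # 1 missing  -> kept, then filled
    "Q2": [np.nan, np.nan, np.nan, 4.0, 5.0],  # 3 missing  -> dropped
    "Q3": [1.0, 2.0, np.nan, np.nan, 5.0],     # 2 missing  -> kept, then filled
})

# Assumption: the columns of interest are exactly those whose name starts with "Q".
q_cols = [c for c in df.columns if c.startswith("Q")]

# Count missing values per Q column and collect those with 3 or more.
missing = df[q_cols].isna().sum()
to_drop = missing[missing >= 3].index.tolist()

# Drop those columns; all other columns keep their original order.
result = df.drop(columns=to_drop)

# Fill the remaining missing values with 0, but only in the surviving Q columns.
kept = [c for c in q_cols if c not in to_drop]
result[kept] = result[kept].fillna(0)

print(result)
```

Counting with isna().sum() makes the "3 or more" rule explicit, and because nothing is split and concatenated, the original column order is preserved.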