I have a dataframe of patients' vital signs (HR, O2Sat, Temp, SBP, DBP, Resp) with NaN values. I filled the NaN values for each patient with that patient's mean, grouping on the patient ID (P_ID) column, using this code:
m['HR'] = m['HR'].fillna(m.groupby('P_ID')['HR'].transform('mean'))
m['O2Sat'] = m['O2Sat'].fillna(m.groupby('P_ID')['O2Sat'].transform('mean'))
m['Temp'] = m['Temp'].fillna(m.groupby('P_ID')['Temp'].transform('mean'))
m['SBP'] = m['SBP'].fillna(m.groupby('P_ID')['SBP'].transform('mean'))
m['DBP'] = m['DBP'].fillna(m.groupby('P_ID')['DBP'].transform('mean'))
m['Resp'] = m['Resp'].fillna(m.groupby('P_ID')['Resp'].transform('mean'))
It worked perfectly. However, it is a lot of code. Is there any way I can use a for loop to fill NaN values in only the vital-sign columns? There are some other columns without NaN values. Thanks.
CodePudding user response:
Yes, you can loop over the column names:
for i in ['HR','O2Sat','Temp','SBP','DBP','Resp']:
    m[i] = m[i].fillna(m.groupby('P_ID')[i].transform('mean'))
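If you would rather not loop at all, the same fill can probably be done in one step, since `fillna` accepts a DataFrame and aligns it on index and column labels. A minimal sketch, assuming a small made-up dataframe in place of your real `m` (the values and the `Age` column are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataframe `m`
m = pd.DataFrame({
    'P_ID': [1, 1, 1, 2, 2],
    'HR': [80.0, np.nan, 90.0, 60.0, np.nan],
    'Temp': [36.5, 37.5, np.nan, np.nan, 38.0],
    'Age': [50, 50, 50, 70, 70],  # a column that should be left alone
})

vitals = ['HR', 'Temp']  # extend with 'O2Sat', 'SBP', 'DBP', 'Resp' as needed

# Per-patient means for all vital columns at once, aligned to the original rows
group_means = m.groupby('P_ID')[vitals].transform('mean')

# Passing a DataFrame to fillna fills each NaN from the matching row and column
m[vitals] = m[vitals].fillna(group_means)
```

Note that if a patient has no recorded values at all for a column, the group mean is itself NaN, so those cells stay NaN with either approach.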
CodePudding user response:
To avoid repeating the same line, you can iterate over the dataframe's columns attribute. Note, however, that this visits every column, not only the vital signs you mentioned, so use it only if you want the same fill applied to all of them:
for column in m.columns:
    # fill this column's NaN values with the per-patient mean of the same column
    m[column] = m[column].fillna(m.groupby('P_ID')[column].transform('mean'))
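If you do not want to type the vital-sign names by hand, one option is to build the list from the dataframe itself. The sketch below is only an illustration under assumptions: the toy data and the `Ward` and `Age` columns are made up, and it assumes the columns you want are exactly the numeric ones (other than `P_ID`) that contain at least one NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataframe `m`
m = pd.DataFrame({
    'P_ID': [1, 1, 2, 2],
    'HR': [80.0, np.nan, 60.0, 70.0],
    'Ward': ['A', 'A', 'B', 'B'],  # non-numeric, so a mean makes no sense
    'Age': [50, 50, 70, 70],       # numeric but has no NaN
})

# Numeric columns only, excluding the grouping key
numeric = m.select_dtypes(include='number').columns.drop('P_ID')

# Keep only the columns that actually have something to fill
to_fill = [c for c in numeric if m[c].isna().any()]

for column in to_fill:
    m[column] = m[column].fillna(m.groupby('P_ID')[column].transform('mean'))
```

Here only `HR` ends up in `to_fill`, so `Ward` and `Age` are never touched.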