I want to change the NA values in multiple columns when certain condition is met. Example data set given below.
Pool Area Pool Quality Pool Type Pool Condition Pool Finish
0 800 Good A Good Gunite
1 400 Good C Good Vinyl
2 485 Good B Good Fibreglass
3 360 Poor C Poor Vinyl
4 0 NaN NaN NaN NaN
5 600 Best A Best Gunite
6 500 Best B Best Fibreglass
7 0 NaN NaN NaN NaN
8 750 Best A Best Gunite
9 900 Best A Best Gunite
10 0 NaN NaN NaN NaN
11 900 Best A Best Gunite
12 400 Poor C Poor Fibreglass
13 0 NaN NaN NaN NaN
In the above data, I want to replace the NaN values with 'No Pool' where the 'Pool Area' column has value '0'.
I know I can do it with np.where function and I tried the below code.
df[['Pool Quality', 'Pool Type', 'Pool Condition', 'Pool Finish']] = np.where(df['Pool Area']==0, 'No Pool', df[['Pool Quality', 'Pool Type', 'Pool Condition', 'Pool Finish']])
It isn't working.
I tried it separately, it works (Ref to code below).
df['Pool Quality'] = np.where(df['Pool Area']==0, 'No Pool', df['Pool Quality'])
But when I tried to do for multiple columns in one go, it is not working.
Below is the error I got.
ValueError: operands could not be broadcast together with shapes (2919,) () (2919,5)
NOTE: The above error message is taken from my actual data set where the dimension is 2919 rows and 81 columns.
I don't know what is wrong in my code. Please help me.
CodePudding user response:
Use boolean indexing:
m = df['Pool Area'].eq(0)
df.loc[m] = df.loc[m].fillna('No Pool')
# or
# df[m] = df[m].fillna('No Pool')
# or to limit to given columns
# cols = ['Pool Quality', 'Pool Type', 'Pool Condition', 'Pool Finish']
# df.loc[m, cols] = df.loc[m, cols].fillna('No Pool')
updated df
:
Pool Area Pool Quality Pool Type Pool Condition Pool Finish
0 800 Good A Good Gunite
1 400 Good C Good Vinyl
2 485 Good B Good Fibreglass
3 360 Poor C Poor Vinyl
4 0 No Pool No Pool No Pool No Pool
5 600 Best A Best Gunite
6 500 Best B Best Fibreglass
7 0 No Pool No Pool No Pool No Pool
8 750 Best A Best Gunite
9 900 Best A Best Gunite
10 0 No Pool No Pool No Pool No Pool
11 900 Best A Best Gunite
12 400 Poor C Poor Fibreglass
13 0 No Pool No Pool No Pool No Pool