Having a data set as below.Here I need to group the subset in column and fill the missing values using mode method.
Need to group the value 'Tom' from name and fill the missing value in value using mode.
Name Value
0 Tom 20.0
1 Tom 20.0
2 Tom NaN
3 nick 19.0
4 nick 18.0
5 krish 15.0
6 nick 15.0
7 nick 20.0
8 jack NaN
9 Tom 23.0
10 Tom NaN
I tried the code below:
data['value']= data.loc[data["Name"]=='Tom'].fillna(data['value'].mode()[0], inplace=True)
Didn't get the output as expected.
CodePudding user response:
To replace on specific conditions you could use np.where()
to evaluate the condition, when it is met, replace it with the mode, otherwise retain the column's original value. Kindly try:
data['Value'] = np.where(data['Name'] =='Tom',data['Value'].fillna(data['Value'].mode()[0]),data['Value'])
This returns:
Name Value
0 Tom 20.0
1 Tom 20.0
2 Tom 20.0
3 nick 19.0
4 nick 18.0
5 krish 15.0
6 nick 15.0
7 nick 20.0
8 jack NaN
9 Tom 23.0
10 Tom 20.0
CodePudding user response:
You can fill all NaN
with:
import pandas as pd
import numpy as np
from scipy import stats
df['Value'] = df.groupby('Name')['Value'].transform(lambda x: x.fillna(stats.mode(x)[0][0]))
df
###
Name Value
0 Tom 20.0
1 Tom 20.0
2 Tom 20.0
3 nick 19.0
4 nick 18.0
5 krish 15.0
6 nick 15.0
7 nick 20.0
8 jack NaN
9 Tom 23.0
10 Tom 20.0
For multiple columns' condition
df2
###
Name Country Value
0 Tom USA 20.0
1 Tom UK 20.0
2 Tom UK NaN
3 nick USA 19.0
4 nick USA 18.0
5 Tom India 23.0
6 nick USA 15.0
7 nick USA 20.0
8 jack India NaN
9 Tom UK 23.0
10 Tom UK 23.0
df2['Value'] = df2.groupby(['Name','Country'])['Value'].transform(lambda x: x.fillna(stats.mode(x)[0][0]))
df2
###
Name Country Value
0 Tom USA 20.0
1 Tom UK 20.0
2 Tom UK 23.0
3 nick USA 19.0
4 nick USA 18.0
5 Tom India 23.0
6 nick USA 15.0
7 nick USA 20.0
8 jack India NaN
9 Tom UK 23.0
10 Tom UK 23.0