Having a data set as below.Here I need to group the subset in column and fill the missing values using mode method.
Need to group the value 'Tom' from name and 'UK' from Country and fill the missing value in value using mode.
Name | Country | Value |
---|---|---|
Tom | USA | 30.0 |
Tom | UK | 20.0 |
Tom | UK | NaN |
nick | USA | 19.0 |
nick | USA | 18.0 |
Tom | India | 23.0 |
nick | USA | 15.0 |
nick | USA | 20.0 |
jack | India | NaN |
Tom | UK | 23.0 |
Tom | UK | 23.0 |
I tried the code below.
np.where((data['Name'] == 'Tom') & (data['Country']=='UK'),data['Value'].fillna(data['Value'].mode()[0]),data['Value'])
CodePudding user response:
I use .eq() .lt() .ge()
etc, to avoid excess ()
everywhere when combining conditions~
Also, since this isn't an if...else
, we can just use pd.DataFrame.mask
instead of np.where
~
I also assumed that you only want the mode
of the same group~ rather than the whole column. If you want the whole column change data.loc[mask, 'Value']
to just data['Value']
mask = (data['Name'].eq('Tom') &
data['Country'].eq('UK'))
data['Value'] = data['Value'].mask(mask & data['Value'].isna(),
data.loc[mask, 'Value'].mode()[0])
Name Country Value
0 Tom USA 30.0
1 Tom UK 20.0
2 Tom UK 23.0
3 nick USA 19.0
4 nick USA 18.0
5 Tom India 23.0
6 nick USA 15.0
7 nick USA 20.0
8 jack India NaN
9 Tom UK 23.0
10 Tom UK 23.0
CodePudding user response:
You don't really need to go as far as masking the data for the mode you can simply try:
data['Value'] = np.where((data['Name'] == 'Tom') & (data['Country']=='UK'),data['Value'].fillna(data['Value'].mode()[0]),data['Value'])
Edit: If you'd like to address only those Null values where the Name is Tom and Country UK, you can add an additional criteria:
data['Value'] = np.where((data['Name'] == 'Tom') & (data['Country']=='UK') & (data['Value'].isna()),data['Value'].mode()[0],data['Value'])
CodePudding user response:
here is one way using group by
and ffill
:
df["Value"] = df.groupby(["Name", "Country"])["Value"].ffill()
output :
Name Country Value
0 Tom USA 20.0
1 Tom UK 20.0
2 Tom UK 20.0
3 nick USA 19.0
4 nick USA 18.0
5 Tom India 23.0
6 nick USA 15.0
7 nick USA 20.0
8 jack India NaN
9 Tom UK 23.0
10 Tom UK 23.0