Missing value replacement using mode in pandas in a subgroup of a group

Time:08-07

I have the data set below. I need to group by a subset of the columns and fill the missing values using the mode. Specifically, I need to fill the missing values for Tom from the UK: group the rows where Name is Tom and location is UK, and fill the missing values in that group with its most frequent value.

[image: datammain]

The figure below shows how I need to do the group-by. In the matrix below, I need to replace all the NaN values using the mode.

[image: new]

The desired output:

[image: output]

Attaching the dataset:

Name  location Value
Tom   USA      20
Tom   UK       NaN
Tom   USA      NaN
Tom   UK       20
Jack  India    NaN
Nihal Africa   30
Tom   UK       NaN
Tom   UK       20
Tom   UK       30
Tom   UK       20
Tom   UK       30
Sam   UK       30
Sam   UK       30
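
For anyone who wants to reproduce this, the listing above can be rebuilt as a DataFrame. This is only a minimal sketch: it assumes the missing entries in the listing are real missing values (np.nan) rather than strings, and the name missing_per_group is just an illustrative choice.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data; missing entries are assumed to be np.nan.
df = pd.DataFrame({
    "Name": ["Tom", "Tom", "Tom", "Tom", "Jack", "Nihal", "Tom",
             "Tom", "Tom", "Tom", "Tom", "Sam", "Sam"],
    "location": ["USA", "UK", "USA", "UK", "India", "Africa", "UK",
                 "UK", "UK", "UK", "UK", "UK", "UK"],
    "Value": [20, np.nan, np.nan, 20, np.nan, 30, np.nan,
              20, 30, 20, 30, 30, 30],
})

# Count the missing values in each (Name, location) subgroup.
missing_per_group = (
    df["Value"].isna().groupby([df["Name"], df["location"]]).sum()
)
print(missing_per_group)
```

With this data, the Tom/UK subgroup has two missing values, and 20 is its most frequent non-missing value.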

CodePudding user response:

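Try:

import pandas as pd

# Mode of Value for the Tom/UK subgroup, as {'Value': {('Tom', 'UK'): mode}}
tom_uk = df[df.Name.eq('Tom') & df.location.eq('UK')]
fill_values = tom_uk.groupby(['Name', 'location']).agg(pd.Series.mode).to_dict()

# Index by (Name, location) so fillna can match the dict keys, then restore the columns
df = df\
    .set_index(['Name', 'location'])\
    .fillna(fill_values)\
    .reset_index()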

Output:

     Name location Value
0     Tom      USA    20
1     Tom       UK    20
2     Tom      USA   NaN
3     Tom       UK    20
4    Jack    India   NaN
5   Nihal   Africa    30
6     Tom       UK    20
7     Tom       UK    20
8     Tom       UK    30
9     Tom       UK    20
10    Tom       UK    30
11    Sam       UK    30
12    Sam       UK    30
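
If the goal is to fill every (Name, location) subgroup with its own mode rather than only Tom/UK, a groupby with transform is one possible approach. The sketch below is not the answer's method; fill_with_group_mode and first_mode are hypothetical helper names, and ties between modes are resolved by taking the smallest value, which is an assumption. Note that it would also fill the Tom/USA row, unlike the output above, and that a subgroup with no non-missing values (Jack/India here) stays NaN.

```python
import numpy as np
import pandas as pd

def fill_with_group_mode(frame, keys, column):
    """Fill NaNs in `column` with the mode of each `keys` subgroup.

    Hypothetical helper for illustration; returns a new DataFrame.
    """
    def first_mode(s):
        modes = s.mode()  # ignores NaN by default; results are sorted
        # Assumption: on ties take the first (smallest) mode.
        return modes.iloc[0] if not modes.empty else np.nan

    # Broadcast each subgroup's mode back onto that subgroup's rows.
    group_mode = frame.groupby(keys)[column].transform(first_mode)
    out = frame.copy()
    out[column] = out[column].fillna(group_mode)
    return out

# Sample data from the question; missing entries assumed to be np.nan.
df = pd.DataFrame({
    "Name": ["Tom", "Tom", "Tom", "Tom", "Jack", "Nihal", "Tom",
             "Tom", "Tom", "Tom", "Tom", "Sam", "Sam"],
    "location": ["USA", "UK", "USA", "UK", "India", "Africa", "UK",
                 "UK", "UK", "UK", "UK", "UK", "UK"],
    "Value": [20, np.nan, np.nan, 20, np.nan, 30, np.nan,
              20, 30, 20, 30, 30, 30],
})

filled = fill_with_group_mode(df, ["Name", "location"], "Value")
print(filled)
```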