Home > Software engineering >  GroupBy pandas DataFrame and select most common value which is alphabetically first
GroupBy pandas DataFrame and select most common value which is alphabetically first

Time:03-24

I have a data frame with multiple columns but currently, I am focusing on province and city columns. The problem is I want to fill NaN values in the city column by the most frequent city value in the same province, but if the value count of any city is the same then it should fill with the one which is alphabetically first. I followed an answer from this link and modified it, but values are not being sorted alphabetically for some reason.

Here's my code:

df.groupby(['province'])['city'].agg(lambda x:x.value_counts().sort_values(ascending=False).index[0])

what do I need to change?

Sample DataFrame:

{'province': {0: 'A',
  1: 'A',
  2: 'A',
  3: 'A',
  4: 'A',
  5: 'A',
  6: 'A',
  7: 'B',
  8: 'B',
  9: 'B',
  10: 'B',
  11: 'B',
  12: 'B',
  13: 'B'},
 'city': {0: 'newyork',
  1: nan,
  2: 'newyork',
  3: 'london',
  4: 'london',
  5: nan,
  6: 'houston',
  7: 'hyderabad',
  8: 'karachi',
  9: nan,
  10: 'hyderabad',
  11: nan,
  12: 'beijing',
  13: 'karachi'}}

Required Output:

{'province': {0: 'A',
  1: 'A',
  2: 'A',
  3: 'A',
  4: 'A',
  5: 'A',
  6: 'A',
  7: 'B',
  8: 'B',
  9: 'B',
  10: 'B',
  11: 'B',
  12: 'B',
  13: 'B'},
 'city': {0: 'newyork',
  1: 'london',
  2: 'newyork',
  3: 'london',
  4: 'london',
  5: 'london',
  6: 'houston',
  7: 'hyderabad',
  8: 'karachi',
  9: 'hyderabad',
  10: 'hyderabad',
  11: 'hyderabad',
  12: 'beijing',
  13: 'karachi'}}

CodePudding user response:

Try with groupby and mode:

mapper = df.groupby("province")["city"].agg(lambda x: x.mode().sort_values()[0]).to_dict()
df["city"] = df["city"].where(df["city"].notnull(),
                              df["province"].map(mapper))

>>> df
   province       city
0         A    newyork
1         A     london
2         A    newyork
3         A     london
4         A     london
5         A     london
6         A    houston
7         B  hyderabad
8         B    karachi
9         B  hyderabad
10        B  hyderabad
11        B  hyderabad
12        B    beijing
13        B    karachi
  • Related