Home > Net >  Python Pandas groupby and sort along multiple columns
Python Pandas groupby and sort along multiple columns

Time:12-13

I was playing with pandas groupby function, and there is something I can't manage to achieve.

My data is like :

   data = ({
    'Color1':["Blue", "Red", "Green", "Blue", "Red", "Green", "Blue", "Red", "Green"],
    'Color2':["Purple", "Pink", "Yellow", "Purple", "Pink", "Yellow", "Brown", "White", "Grey"],
    'Value':[20, 20, 20, 25, 25, 25, 5, 55, 30]
})

df = pd.DataFrame(data)

I used the groupby to do some sorting (the idea behind is to extract some top N from larger datasets)

df2 = df.groupby(['Color1'], sort=True).sum()[['Value']].reset_index()
df2 = df2.sort_values(by=['Value'], ascending=False)
print(df2)

Color1 Value 2 Red 100 1 Green 75 0 Blue 50

But my biggest concern is how to groupby and sort adding Color2 while preserving the sort on Color 1 i.e. a result such as :

  Color1  Color2  Value
0    Red   White     55
1    Red    Pink     45
2  Green  Yellow     45
3  Green    Grey     30
4   Blue  Purple     45
5   Blue   Brown      5

Thanks a lot for your help

CodePudding user response:

Try:

>>> df.groupby(['Color1', 'Color2']).sum() \
      .sort_values(['Color1', 'Value'], ascending=False).reset_index()

  Color1  Color2  Value
0    Red   White     55
1    Red    Pink     45
2  Green  Yellow     45
3  Green    Grey     30
4   Blue  Purple     45
5   Blue   Brown      5

CodePudding user response:

Problem is values are strings, so sum join values instead summing.

Need convert column to numeric:

df = pd.DataFrame(data)
df['Value'] = df['Value'].astype(int)
df2 = df.groupby(['Color1','Color2'], sort=False)['Value'].sum().reset_index()

df2 = df2.sort_values(by=['Value'], ascending=False)

If need sorting by Color1, Color2 with original order in Color1 use ordered Categoricals:

vals = df2['Color1'].unique()
df2['Color1'] = pd.Categorical(df2['Color1'], ordered=True, categories=vals)

df2 = df2.sort_values(['Color1','Color2'])
print(df2)

  Color1  Color2  Value
1    Red    Pink     45
4    Red   White     55
3   Blue   Brown      5
0   Blue  Purple     45
5  Green    Grey     30
2  Green  Yellow     45
  • Related