Sorting by two values and keeping the two highest values

I have a problem. I want to create a chart that shows the difference between buyers (b) and sellers (s), but only for the top 2 countries of each. Is there a way to filter by b and s and keep the 2 rows with the highest count for each?

Dataframe

   count country part
0     50      DE    b
1     20      CN    b
2     30      CN    s
3    100      BG    s
4      3      PL    b
5     40      BG    b
6      5      RU    s

Code

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
d = {'count': [50, 20, 30, 100, 3, 40, 5],
     'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
     'part': ['b', 'b', 's', 's', 'b', 'b', 's']
    }
df = pd.DataFrame(data=d)


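# In the original project df_party is built from two frames that are not shown here:
# df_consignee_countries['party'] = 'consignee'
# df_orders_countries['party'] = 'buyer'
# df_party = pd.concat([df_consignee_countries, df_orders_countries], join="outer")
# For this reproducible example, plot the sample frame directly.
df_party = df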

ax = sns.barplot(x="country", y="count", hue='part', data=df_party, palette='GnBu')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
for p in ax.patches:
    ax.annotate(format(p.get_height(), '.1f'), 
                   (p.get_x() + p.get_width() / 2., p.get_height()), 
                   ha = 'center', va = 'center', 
                   xytext = (0, 9), 
                   textcoords = 'offset points')

What I want

   count country part
0     50      DE    b
2     30      CN    s
3    100      BG    s
5     40      BG    b

CodePudding user response：

You first need to sort your values and then take 2 rows per group:

>>> df.sort_values('count', ascending=False).groupby('part').head(2)
   count country part
3    100      BG    s
0     50      DE    b
5     40      BG    b
2     30      CN    s
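
If the rows should come back in the same order as the expected output in the question, the result can be sorted by its original index afterwards. A minimal sketch, assuming the sample frame from the question (the name top2 is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'count': [50, 20, 30, 100, 3, 40, 5],
    'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
    'part': ['b', 'b', 's', 's', 'b', 'b', 's'],
})

# Sort descending by count, keep the first 2 rows of each part,
# then restore the original row order via the index.
top2 = (
    df.sort_values('count', ascending=False)
      .groupby('part')
      .head(2)
      .sort_index()
)
print(top2)
```

The filtered frame can then be passed to sns.barplot as the data argument in place of the full frame.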

CodePudding user response：

Is this what you want?

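df_b = df.loc[df['part'] == 'b'].sort_values(by='count', ascending=False).head(n=2)
df_s = df.loc[df['part'] == 's'].sort_values(by='count', ascending=False).head(n=2)
# DataFrame.append was removed in pandas 2.0, so combine the two parts with pd.concat
df_top = pd.concat([df_b, df_s])
df_top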
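
The same per-part filtering can probably be written without one line per part, which may help if more parts are added later. A rough sketch, assuming the sample frame from the question and unique row labels (largest and top_rows are illustrative names, not part of the answer above):

```python
import pandas as pd

df = pd.DataFrame({
    'count': [50, 20, 30, 100, 3, 40, 5],
    'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
    'part': ['b', 'b', 's', 's', 'b', 'b', 's'],
})

# Largest 2 counts within each part; the result is a Series whose
# MultiIndex is (part, original row label).
largest = df.groupby('part')['count'].nlargest(2)

# Use the original row labels (second index level) to pull the full rows.
top_rows = df.loc[largest.index.get_level_values(1)]
print(top_rows)
```

The rows come back grouped by part (b first, then s), each group ordered from highest to lowest count.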