I have a problem. I want to create a chart that shows the difference between buyers (b) and sellers (s), but only for the top 2 countries of each. Is there a way to filter on b and s and keep only the 2 rows with the highest count in each group?
Dataframe
   count country part
0     50      DE    b
1     20      CN    b
2     30      CN    s
3    100      BG    s
4      3      PL    b
5     40      BG    b
6      5      RU    s
Code
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
d = {'count': [50, 20, 30, 100, 3, 40, 5],
     'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
     'part': ['b', 'b', 's', 's', 'b', 'b', 's']
     }
df = pd.DataFrame(data=d)
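# df_party is built from two frames that are not shown here:
# df_consignee_countries['party'] = 'consignee'
# df_orders_countries['party'] = 'buyer'
# df_party = pd.concat([df_consignee_countries, df_orders_countries], join="outer")
df_party = df  # the sample frame above stands in for it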
ax = sns.barplot(x="country", y="count", hue='part', data=df_party, palette='GnBu')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
for p in ax.patches:
    # Label each bar with its height, centered just above the bar
    ax.annotate(format(p.get_height(), '.1f'),
                (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center',
                xytext=(0, 9),
                textcoords='offset points')
What I want
   count country part
0     50      DE    b
2     30      CN    s
3    100      BG    s
5     40      BG    b
CodePudding user response:
First sort the rows by count in descending order, then take the first 2 rows of each part group:
>>> df.sort_values('count', ascending=False).groupby('part').head(2)
   count country part
3    100      BG    s
0     50      DE    b
5     40      BG    b
2     30      CN    s
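To get the rows back in the order shown under "What I want", the same sort-then-head approach can be followed by a sort on the index. This is only a sketch on the sample data from the question; the name top2 is made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'count': [50, 20, 30, 100, 3, 40, 5],
    'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
    'part': ['b', 'b', 's', 's', 'b', 'b', 's'],
})

top2 = (
    df.sort_values('count', ascending=False)  # largest counts first
      .groupby('part')
      .head(2)                                # first 2 rows of each part
      .sort_index()                           # restore the original row order
)
print(top2)
```

The resulting frame should then be usable as the data argument of the sns.barplot call in the question, in place of the unfiltered frame.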
CodePudding user response:
Is this what you want?
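df_b = df.loc[df['part'] == 'b'].sort_values(by='count', ascending=False).head(2)
df_s = df.loc[df['part'] == 's'].sort_values(by='count', ascending=False).head(2)
# DataFrame.append was removed in pandas 2.0, so combine the two parts with pd.concat
df_top = pd.concat([df_b, df_s])
df_top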
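If you would rather not filter each part by hand, one possible variant is to let groupby pick the 2 largest counts per part and then look the rows up by their original index. This is a sketch on the sample data, not a drop-in for the real frames; the names largest and top2_alt are made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'count': [50, 20, 30, 100, 3, 40, 5],
    'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
    'part': ['b', 'b', 's', 's', 'b', 'b', 's'],
})

# The 2 largest counts per part; the result is indexed by (part, original row label)
largest = df.groupby('part')['count'].nlargest(2)

# Use the original row labels (last index level) to select the full rows
top2_alt = df.loc[largest.index.get_level_values(-1)]
print(top2_alt)
```

This scales to any number of distinct values in part without writing one filter per value.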