Categorize and order bar chart by Hue

Time:05-06

I have a problem. I want to show the two countries with the highest count in each category. Unfortunately, I only get the output below. I would like each part to be shown as its own category on the x-axis. Is there an option for this?

import pandas as pd
import seaborn as sns
d = {'count': [50, 20, 30, 100, 3, 40, 5],
     'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
     'part': ['b', 'b', 's', 's', 'b', 's', 's']
    }
df = pd.DataFrame(data=d)
print(df)

#print(df.sort_values('count', ascending=False).groupby('part').head(2))

ax = sns.barplot(x="country", y="count", hue='part',
                 data=df.sort_values('count', ascending=False).groupby('part').head(2), palette='GnBu')

What I got

[image: current output]

What I want

[image: desired output]

CodePudding user response:

You can always skip seaborn and plot everything directly in matplotlib.

from matplotlib import pyplot as plt
import pandas as pd

# the 'seaborn' style was renamed to 'seaborn-v0_8' in matplotlib 3.6
plt.style.use('seaborn-v0_8')

df = pd.DataFrame({
    'count': [50, 20, 30, 100, 3, 40, 5],
    'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
    'part': ['b', 'b', 's', 's', 'b', 'b', 's']
})

fig, ax = plt.subplots()
offset = .2
xticks, xlabels = [], []
handles, labels = [], []

for i, (idx, group) in enumerate(df.groupby('part')):
    plot_data = group.nlargest(2, 'count')
    x = [i - offset, i + offset]
    barcontainer = ax.bar(x=x, height=plot_data['count'], width=.35)

    xticks += [i - offset, i + offset]
    xlabels += plot_data['country'].tolist()
    handles.append(barcontainer[0])
    labels.append(idx)

ax.set_xticks(xticks)
ax.set_xticklabels(xlabels)
ax.legend(handles=handles, labels=labels, title='Part')
plt.show()
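
The selection step can be checked on its own, without plotting. The following is a minimal sketch, assuming the same modified data as in the answer above (sixth 'part' entry set to 'b'); the names top_by_loop and top_by_sort are only illustrative. It compares the per-group nlargest call used in the loop with the sort_values/groupby/head expression from the question.

```python
import pandas as pd

# Same data as in the answer above (sixth 'part' entry is 'b').
df = pd.DataFrame({
    'count': [50, 20, 30, 100, 3, 40, 5],
    'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
    'part': ['b', 'b', 's', 's', 'b', 'b', 's'],
})

# Top two countries per part, taken group by group as in the loop above.
top_by_loop = {part: group.nlargest(2, 'count')['country'].tolist()
               for part, group in df.groupby('part')}

# The same selection as a single expression, as in the question.
top_by_sort = (df.sort_values('count', ascending=False)
                 .groupby('part')
                 .head(2))

print(top_by_loop)
print(top_by_sort)
```

Both variants pick the same four rows; the loop version keeps them separated by part, which is what the manual bar positioning needs.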


CodePudding user response:

The following approach creates a FacetGrid for your data. Seaborn 0.11.2 introduced the helpful g.axes_dict. (In the example data I changed the part of the second 'BG' entry to 'b', assuming that each country/part combination occurs only once, as in the example plots.)

from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd

d = {'count': [50, 20, 30, 100, 3, 40, 5],
     'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
     'part': ['b', 'b', 's', 's', 'b', 'b', 's']
     }
df = pd.DataFrame(data=d)
sns.set()
g = sns.FacetGrid(data=df, col='part', col_wrap=2, sharey=True, sharex=False)
for part, df_part in df.groupby('part'):
    # the two countries with the highest count in this part, in plotting order
    order = df_part.nlargest(2, 'count')['country']
    ax = sns.barplot(data=df_part, x='country', y='count', order=order, palette='summer', ax=g.axes_dict[part])
    ax.set(xlabel=f'part = {part}')
g.set_ylabels('count')
plt.tight_layout()
plt.show()

[image: sns.FacetGrid with custom-ordered bar plots]
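
If a country/part combination can occur more than once, as 'BG' does for part 's' in the question's original data, one option is to aggregate before picking the top two. This is only a sketch under the assumption that duplicate rows should be summed; the names totals and order_per_part are illustrative and not part of the answer above.

```python
import pandas as pd

# Original data from the question: 'BG' appears twice in part 's'.
df = pd.DataFrame({
    'count': [50, 20, 30, 100, 3, 40, 5],
    'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
    'part': ['b', 'b', 's', 's', 'b', 's', 's'],
})

# Assumption: duplicate country/part rows should be added together.
totals = df.groupby(['part', 'country'], as_index=False)['count'].sum()

# Two largest totals per part, usable as the `order` of each facet.
order_per_part = {part: group.nlargest(2, 'count')['country'].tolist()
                  for part, group in totals.groupby('part')}

print(totals)
print(order_per_part)
```

With this, totals could be passed as the data of the FacetGrid and order_per_part[part] as the order of each bar plot.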
