seaborn: barplot of a dataframe by group

Time:12-13

I am having difficulty with this. I have the results from my initial model (`Unfiltered`), which I plot like so:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame(
    {'class': ['foot', 'bike', 'bus', 'car', 'metro'],
     'Precision': [0.7, 0.66, 0.41, 0.61, 0.11],
     'Recall': [0.58, 0.35, 0.13, 0.89, 0.02],
     'F1-score': [0.64, 0.45, 0.2, 0.72, 0.04]}
)

# Reshape to long format: one row per (class, Metric) pair
groups = df.melt(id_vars=['class'], var_name='Metric')
sns.barplot(data=groups, x='class', y='value', hue='Metric')

This produces a nice grouped bar plot, with one group per class and one bar per metric.

Now I have obtained a second set of results from my improved model (`Filtered`), so I add a column (status) to my DataFrame to indicate which model each row comes from:

df2 = pd.DataFrame(
    {'class': ['foot','foot','bike','bike','bus','bus',
               'car','car','metro','metro'],
     'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
     'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
     'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
     'status': ['Unfiltered', 'Filtered', 'Unfiltered', 'Filtered', 'Unfiltered',
                'Filtered', 'Unfiltered', 'Filtered', 'Unfiltered', 'Filtered']}
)

df2.head()
  class  Precision  Recall  F1-score      status
0  foot       0.70    0.58      0.64  Unfiltered
1  foot       0.62    0.93      0.74    Filtered
2  bike       0.66    0.35      0.45  Unfiltered
3  bike       0.96    0.40      0.56    Filtered
4   bus       0.41    0.13      0.20  Unfiltered

I want to plot this with the same grouping as above (i.e. foot, bike, bus, car, metro). However, for each metric, I want to place the two values side by side. For example, the foot group would have two bars for Precision (Unfiltered, Filtered), two bars for Recall (Unfiltered, Filtered) and two bars for F1-score (Unfiltered, Filtered). Likewise for all other groups.

My attempt:

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')
sns.barplot(data=group2, x='class', y='value', hue='Metric')

This is not at all what I want: because status is not used in the plot, the Unfiltered and Filtered values are aggregated into a single bar per metric.

CodePudding user response:

You can pass any sequence to hue, as long as it has the same length as your data, and the colours are assigned from its values. So you could try:

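import matplotlib.pyplot as plt

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')
# Combine Metric and status into one tuple per row and use that as the hue
sns.barplot(data=group2, x='class', y='value', hue=group2[['Metric','status']].agg(tuple, axis=1))
plt.legend(fontsize=7)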

But the result is a bit hard to read.
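A possibly more readable variant of the same idea, as a minimal sketch: build one string label per row instead of a tuple, and use a paired palette so that the two bars for each metric share a colour family (light for Unfiltered, dark for Filtered). The column name `Metric (status)`, the explicit `hue_order`, and the choice of the `Paired` palette are assumptions made for illustration; they are not part of the answer above.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df2 = pd.DataFrame(
    {'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
               'car', 'car', 'metro', 'metro'],
     'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
     'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
     'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
     'status': ['Unfiltered', 'Filtered'] * 5}
)

# Long format: one row per (class, status, Metric) combination
long_df = df2.melt(id_vars=['class', 'status'], var_name='Metric')

# Hypothetical combined label, e.g. "Precision (Unfiltered)"
long_df['Metric (status)'] = long_df['Metric'] + ' (' + long_df['status'] + ')'

# Fix the bar order so each metric's Unfiltered/Filtered pair sits together
hue_order = [f'{metric} ({status})'
             for metric in ['Precision', 'Recall', 'F1-score']
             for status in ['Unfiltered', 'Filtered']]

# "Paired" alternates light/dark shades, so each metric gets one colour family
ax = sns.barplot(data=long_df, x='class', y='value',
                 hue='Metric (status)', hue_order=hue_order,
                 palette=sns.color_palette('Paired', n_colors=6))
ax.legend(fontsize=7, title=None)
plt.tight_layout()
```

With six bars per class the plot is still dense, so widening the figure may help.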

CodePudding user response:

Seaborn grouped barplots don't allow for multiple grouping variables. One workaround is to recode the two grouping variables (Metric and status) as a single variable with 6 levels (3 metrics × 2 statuses). Another possibility is to use facets. If you are open to another plotting package, I might recommend [image]
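The facet option mentioned above could look roughly like this. It is only a sketch, assuming `sns.catplot` with one panel per class, the metrics on the x-axis, and status as the hue; the `height` and `aspect` values are arbitrary choices.

```python
import pandas as pd
import seaborn as sns

df2 = pd.DataFrame(
    {'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
               'car', 'car', 'metro', 'metro'],
     'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
     'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
     'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
     'status': ['Unfiltered', 'Filtered'] * 5}
)

# Long format: one row per (class, status, Metric) combination
long_df = df2.melt(id_vars=['class', 'status'], var_name='Metric')

# One facet (column) per class; within each facet, every metric gets
# an Unfiltered bar and a Filtered bar side by side
g = sns.catplot(data=long_df, kind='bar',
                x='Metric', y='value', hue='status', col='class',
                height=3, aspect=0.7)
g.set_titles('{col_name}')
g.set_axis_labels('', 'value')
```

Swapping the roles (`col='Metric'`, `x='class'`) gives three panels instead of five, which may be easier to compare across classes.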