How to modify seaborn graph to have percentage information per class?

Time:12-07

I am using the following code to plot the percentage of returned and non-returned products (y-axis) against the percentage discount given on those products (x-axis). As plotted, all bars together add up to 100 percent.

import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots()
sns.histplot(data=df_c, x='percent_discount_100', hue='returned', binwidth=10, stat='percent', ax=ax)
#ax.set_xlim(1,31)
#ax.set_xticks(range(0,100,10))
plt.show()


What I want is to change the percentage shown by each bar: for every discount bin, I want to see what percentage of the products in that bin were returned and what percentage were not, so that the two values within each bin add up to 100%. How can I approach this problem?

Solutions using other libraries, such as plotly or matplotlib, are also welcome.

CodePudding user response:

Seaborn is a tool for exploring data quickly, not for controlling every detail. If you want a publishable figure, use whatever tools are necessary. In your case, pandas with matplotlib is a better choice.

Since no data was given, I assume your data looks like this (if not, reshape it into this form):

In [33]: df2 = pd.DataFrame(np.random.rand(10, 4), columns=["a", "b", "c", "d"])
In [34]: df2
Out[34]: 
          a         b         c         d
0  0.440042  0.509648  0.863190  0.532108
1  0.087648  0.695300  0.830660  0.468570
2  0.807494  0.195466  0.911627  0.278780
3  0.870929  0.971947  0.997894  0.780992
4  0.205380  0.097973  0.803379  0.100402
5  0.958186  0.362425  0.915435  0.585129
6  0.961905  0.196360  0.080999  0.933527
7  0.785202  0.497949  0.992512  0.518781
8  0.874882  0.610012  0.348986  0.996064
9  0.424615  0.135498  0.931710  0.619083

Then convert it to percentages. In df3, each row sums to 100:

In [46]: df3 = df2.apply(lambda x: x / sum(x) * 100, axis=1)

In [47]: df3
Out[47]: 
           a          b          c          d
0  18.765234  21.733508  36.809987  22.691272
1   4.209449  33.392931  39.893775  22.503845
2  36.815260   8.911695  41.562896  12.710148
3  24.047114  26.836290  27.552722  21.563874
4  17.013872   8.116157  66.552564   8.317407
5  33.964091  12.846603  32.448702  20.740605
6  44.270478   9.037220   3.727897  42.964405
7  28.098676  17.819245  35.517338  18.564740
8  30.915164  21.555626  12.331912  35.197297
9  20.115278   6.418958  44.137918  29.327846
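For the data in the question, the same row-wise table can probably be built directly from the long-format frame. The following is a minimal sketch, not the asker's actual pipeline: the real df_c was never posted, so a synthetic stand-in is generated here, and the helper column name discount_bin is made up for illustration. It bins the discount with pd.cut and lets pd.crosstab normalize each row.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's df_c (an assumption: the real data was not posted).
rng = np.random.default_rng(0)
df_c = pd.DataFrame({
    "percent_discount_100": rng.uniform(0, 100, size=500),
    "returned": rng.random(500) < 0.3,
})

# Bin the discount into 10-point intervals: [0, 10), [10, 20), ..., [90, 100).
bins = np.arange(0, 101, 10)
df_c["discount_bin"] = pd.cut(df_c["percent_discount_100"], bins=bins, right=False)

# Rows are discount bins, columns are the returned flag.
# normalize="index" divides each row by its own total, so every row sums to 1.
pct = pd.crosstab(df_c["discount_bin"], df_c["returned"], normalize="index") * 100

print(pct.round(1))
```

Each row of pct then holds the returned and not-returned share for one discount bin, which is the shape that a stacked bar plot expects.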

If you just want to see the relative difference,

df3.plot.bar(stacked=True)

[Figure: df3 stacked bar plot]
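To print the percentage on every segment of the stacked bars, one option is matplotlib's Axes.bar_label, which is available from matplotlib 3.4 onward. This is a sketch under the same assumption as above, namely random data in place of the real table; the font size and axis label are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random stand-in data, as in the answer (values will differ from the printed table).
rng = np.random.default_rng(0)
df2 = pd.DataFrame(rng.random((10, 4)), columns=["a", "b", "c", "d"])

# Divide each row by its own sum so that every row adds up to 100.
df3 = df2.div(df2.sum(axis=1), axis=0) * 100

ax = df3.plot.bar(stacked=True)

# One BarContainer per column; label each segment at its center.
labels = []
for container in ax.containers:
    labels.extend(
        ax.bar_label(container, fmt="%.1f%%", label_type="center", fontsize=7)
    )

ax.set_ylabel("percent of row total")
```

Call plt.show() or ax.figure.savefig(...) afterwards to display or save the figure.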

If you want to stay in seaborn and label each bar with its percentage, see sns.histplot with multiple='fill' (multiple is a histplot parameter, not a barplot one).
