I have a dataset that looks like this (assume Clicked has 4 categories; the head(10) only shows 2 of them):
   Rank Clicked
0   2.0    Cat4
1   2.0    Cat4
2   2.0    Cat4
3   1.0    Cat1
4   1.0    Cat4
5   2.0    Cat4
6   2.0    Cat4
7   3.0    Cat4
8   5.0    Cat4
9   5.0    Cat4
The following code produces a stacked bar plot showing the share of each category per rank:
eee = (df.groupby(['Rank','Clicked'])['Clicked'].count()/df.groupby(['Rank'])['Clicked'].count())
eee.unstack().plot.bar(stacked=True)
plt.legend(['Cat1','Cat2','Cat3','Cat4'])
plt.xlabel('Rank')
Is there a way to achieve this with seaborn (or matplotlib) instead of the pandas plotting capabilities? I tried a few approaches, both running seaborn code directly and preprocessing the dataset into the correct format, with no luck.
CodePudding user response:
You can use sns.histplot with multiple="fill", e.g.:
import seaborn as sns

tips = sns.load_dataset("tips")
sns.histplot(
    data=tips,
    x="size", hue="day",
    multiple="fill", stat="proportion",
    discrete=True, shrink=.8
)
CodePudding user response:
sns.barplot doesn't support stacked bars, so you need to plot the cumulative sum and draw the bars on top of each other:
# calculate the distribution of `Clicked` per `Rank`
distribution = pd.crosstab(df.Rank, df.Clicked, normalize='index')

# plot the cumsum, with reverse hue order
sns.barplot(data=distribution.cumsum(axis=1).stack().reset_index(name='Dist'),
            x='Rank', y='Dist', hue='Clicked',
            hue_order=distribution.columns[::-1],  # reverse hue order so that the taller bars are plotted first
            dodge=False)
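To see why overlaying cumulative sums gives a stacked look, it may help to inspect the intermediate table. This is a minimal sketch on a DataFrame rebuilt by hand from the head(10) in the question, so it contains only two of the four categories.

```python
import pandas as pd

# Sample rebuilt from the head(10) shown in the question (not the full dataset)
df = pd.DataFrame({
    "Rank": [2.0, 2.0, 2.0, 1.0, 1.0, 2.0, 2.0, 3.0, 5.0, 5.0],
    "Clicked": ["Cat4", "Cat4", "Cat4", "Cat1", "Cat4",
                "Cat4", "Cat4", "Cat4", "Cat4", "Cat4"],
})

# Share of each `Clicked` category within each `Rank`; every row sums to 1
distribution = pd.crosstab(df.Rank, df.Clicked, normalize='index')

# Running total across categories: each value is the top edge of that
# category's segment in the stacked bar
tops = distribution.cumsum(axis=1)
print(tops)
```

For rank 1.0 the two categories each have a share of 0.5, so the running totals are 0.5 and 1.0. Drawing the bar of height 1.0 first and the bar of height 0.5 over it leaves the upper half of the taller bar visible, which is what reads as a stacked segment.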
Alternatively (and preferably), reverse the cumsum direction so that you don't need to reverse the hue order:
sns.barplot(data=distribution.iloc[:, ::-1].cumsum(axis=1)  # we reverse cumsum direction here
                             .stack().reset_index(name='Dist'),
            x='Rank', y='Dist', hue='Clicked',
            hue_order=distribution.columns,  # forward order
            dodge=False)
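Since the question also allows plain matplotlib, the same normalized table can be stacked with the bottom argument of ax.bar. This is only a sketch under assumptions: the DataFrame is rebuilt by hand from the head(10) in the question, and the axis labels are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample rebuilt from the head(10) shown in the question (not the full dataset)
df = pd.DataFrame({
    "Rank": [2.0, 2.0, 2.0, 1.0, 1.0, 2.0, 2.0, 3.0, 5.0, 5.0],
    "Clicked": ["Cat4", "Cat4", "Cat4", "Cat1", "Cat4",
                "Cat4", "Cat4", "Cat4", "Cat4", "Cat4"],
})

# Share of each `Clicked` category within each `Rank`
distribution = pd.crosstab(df.Rank, df.Clicked, normalize='index')

fig, ax = plt.subplots()
x = [str(rank) for rank in distribution.index]  # ranks as categorical tick labels
bottom = np.zeros(len(distribution))            # running top edge of each stack

for category in distribution.columns:
    heights = distribution[category].to_numpy()
    ax.bar(x, heights, bottom=bottom, label=category)
    bottom += heights                           # next category starts where this one ends

ax.set_xlabel('Rank')
ax.set_ylabel('Proportion')  # illustrative label
ax.legend(title='Clicked')
```

Because each segment starts where the previous one ended, no cumulative sum or hue reordering is needed, and every stack tops out at 1.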