Stacked column bar chart over two variables

Time:04-08

I have some data, as shown below:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

data = {
'gender':['female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'female', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male', 'male'],
'baseline':['M1', 'M1', 'M1', 'M1', 'M1', 'M4', 'M4', 'M2', 'M2', 'M2', 'M2', 'M2', 'M3', 'M3', 'M3', 'M3', 'M5', 'M5', 'M5', 'M5', 'M1', 'M2', 'M3', 'M4', 'M5', 'M2', 'M2', 'M2', 'M3', 'M3', 'M3', 'M3', 'M4', 'M4', 'M4', 'M5', 'M5', 'M5', 'M5', 'M5'],
'endline':['M5', 'M3', 'M1', 'M1', 'M1', 'M4', 'M4', 'M5', 'M2', 'M5', 'M5', 'M3', 'M3', 'M3', 'M4', 'M4', 'M4', 'M1', 'M1', 'M2', 'M5', 'M5', 'M5', 'M1', 'M1', 'M1', 'M1', 'M4', 'M4', 'M4', 'M4', 'M4', 'M3', 'M3', 'M3', 'M3', 'M2', 'M2', 'M2', 'M2']}

df = pd.DataFrame(data)
df.head()

cross_tab_prop = pd.crosstab(index = df['gender'],
                             columns = df['baseline'],
                             normalize = "index")

cross_tab_prop.plot(kind = 'bar', 
                    stacked = True, 
                    colormap = 'tab10', 
                    figsize = (10, 6))

plt.legend(loc = "upper left", ncol = 5)
plt.xlabel("Gender")
plt.ylabel("Proportion")

I would like to produce a chart like the one shown below:

[image: desired chart, not included]

I would appreciate any hints on how to achieve this.

Thanks in advance

Answer:

With seaborn, the approach would be:

  • convert the dataframe to long form and pass it to sns.displot with multiple='fill'

    Here is the same plot with different styling:

    import matplotlib.pyplot as plt
    from matplotlib.ticker import PercentFormatter, MultipleLocator
    import seaborn as sns
    import pandas as pd
    
    # df_long = ...
    sns.set_style('whitegrid')
    g = sns.displot(data=df_long, x='which', hue='property', col='gender', multiple='fill', shrink=0.7, palette='turbo')
    g.set(xlabel='', ylabel='')
    g.axes[0, 0].yaxis.set_major_locator(MultipleLocator(.1))
    g.axes[0, 0].yaxis.set_major_formatter(PercentFormatter(1))
    g.axes[0, 0].set_xlim(-.6, 1.6)
    sns.despine(left=True)
    plt.subplots_adjust(wspace=0)
    

    sns.displot looking like one subplot
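
The displot call above expects a long-form frame, df_long, with the columns gender, which and property, but the answer does not show how it is built. The following is a minimal sketch of one way to get there, assuming pandas' melt is acceptable and using a small stand-in for the question's data. The stand-in rows are illustrative only, and the crosstab at the end is an added sanity check rather than part of the original answer.

```python
import pandas as pd

# Small stand-in for the question's data (assumption: same column names,
# far fewer rows than the original 40).
df = pd.DataFrame({
    'gender':   ['female', 'female', 'female', 'male', 'male', 'male'],
    'baseline': ['M1', 'M2', 'M2', 'M1', 'M3', 'M3'],
    'endline':  ['M5', 'M2', 'M3', 'M5', 'M1', 'M4'],
})

# Stack 'baseline' and 'endline' into a single column, keeping 'gender'
# as the identifier. 'which' and 'property' are the column names the
# displot call refers to.
df_long = df.melt(id_vars='gender', var_name='which', value_name='property')

# Proportions per (gender, which) group: these are the segment heights
# that a filled, stacked bar would show for each column.
props = pd.crosstab(index=[df_long['gender'], df_long['which']],
                    columns=df_long['property'],
                    normalize='index')
print(props.round(2))
```

With the question's full df in place of the stand-in, the same melt call should give a df_long that can be passed to sns.displot as in the answer's code.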
