Stacked barplot inside a bar plot python

Time:01-22

I have the following bar plot. It shows the distribution of the letters in my dataset (x-axis) as percentages (y-axis). Inside each bar I want to show a breakdown by category: for example, of all the L analyzed, 10% are 'male', 60% are 'female', 10% are 'neutral', 10% are 'other' and 10% are missing. I want this for every letter, as in the second attached plot. In other words, a stacked bar plot inside a bar plot, perhaps with the female etc. percentages written as labels inside the bars, since they are on a different scale (the categories for each letter sum to 100%). How can I do that in Python? Thanks a lot!

The reproducible code:

data = {'L': 0.10128343899798979,
 'A': 0.04587392402453482,
 'G': 0.05204199096266515,
 'V': 0.08343212549181313,
 'E': 0.07848392694534645,
 'S': 0.03242100922632854,
 'I': 0.05353675927357696,
 'K': 0.07614727763173719,
 'R': 0.0878305241997835,
 'D': 0.05932683882274109,
 'T': 0.06166348813635036,
 'P': 0.033915777537240344,
 'N': 0.04120062539731629,
 'Q': 0.03858907616445887,
 'F': 0.033073896534542895,
 'Y': 0.04503204302183736,
 'M': 0.018126213425424805,
 'H': 0.04008384447537069,
 'C': 0.0014947683109118087,
 'W': 0.016442451420029897}

import matplotlib.pyplot as plt
plt.bar(range(len(data)), list(data.values()), align='center')
plt.xticks(range(len(data)), list(data.keys()))

#stacked bar plot data subset
index,female,male,neutral,other,missing
L,0.40816326530612246,0.30612244897959184,0.02040816326530612,0.0,0.2653061224489796
A,0.34615384615384615,0.34615384615384615,0.0,0.0,0.3076923076923077
G,0.2962962962962963,0.1111111111111111,0.037037037037037035,0.0,0.5555555555555556
V,0.20833333333333334,0.5625,0.020833333333333332,0.0,0.20833333333333334
E,0.5,0.225,0.025,0.0,0.25

CodePudding user response:

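You can multiply each row of the detail dataframe (the per-letter shares) by that letter's total, so that the segments of each bar add up to the letter's overall frequency. From there, draw the result as a stacked bar plot.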

from matplotlib import pyplot as plt
import pandas as pd
from io import StringIO

df_details_str = '''index,female,male,neutral,other,missing
L,0.40816326530612246,0.30612244897959184,0.02040816326530612,0.0,0.2653061224489796
A,0.34615384615384615,0.34615384615384615,0.0,0.0,0.3076923076923077
G,0.2962962962962963,0.1111111111111111,0.037037037037037035,0.0,0.5555555555555556
V,0.20833333333333334,0.5625,0.020833333333333332,0.0,0.20833333333333334
E,0.5,0.225,0.025,0.0,0.25'''

data = {'L': 0.10128343899798979, 'A': 0.04587392402453482, 'G': 0.05204199096266515, 'V': 0.08343212549181313, 'E': 0.07848392694534645, 'S': 0.03242100922632854, 'I': 0.05353675927357696, 'K': 0.07614727763173719, 'R': 0.0878305241997835, 'D': 0.05932683882274109, 'T': 0.06166348813635036, 'P': 0.033915777537240344, 'N': 0.04120062539731629, 'Q': 0.03858907616445887, 'F': 0.033073896534542895, 'Y': 0.04503204302183736, 'M': 0.018126213425424805, 'H': 0.04008384447537069, 'C': 0.0014947683109118087, 'W': 0.016442451420029897}
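# Overall share of each letter, as a one-column dataframe
df_total = pd.DataFrame(pd.Series(data, name='total'))
# Per-letter breakdown; each row sums to 1
df_details = pd.read_csv(StringIO(df_details_str), index_col=0)

# Scale each row by that letter's total; .loc keeps only the letters
# that have a breakdown, in their original order
df_multiplied = df_details.multiply(df_total['total'], axis='index').loc[df_details.index]
df_multiplied.plot.bar(stacked=True, rot=0, cmap='Set3')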
plt.xlabel('')
plt.tight_layout()
plt.show()

stacked bar plot from multiplied dataframes
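
The question also asks for the within-letter percentages to be written inside the bars, since the segment heights are on a different scale. Below is a minimal sketch of one way to do that, assuming matplotlib 3.4 or newer (for `Axes.bar_label`). The names `segment_labels`, `min_share` and `heights` are made up for this illustration, and the 5% cutoff for hiding labels on very thin segments is an arbitrary choice. Only the five letters that have a breakdown are used.

```python
from io import StringIO

import matplotlib.pyplot as plt
import pandas as pd

# Per-letter breakdown from the question; each row sums to 1.
details_csv = '''index,female,male,neutral,other,missing
L,0.40816326530612246,0.30612244897959184,0.02040816326530612,0.0,0.2653061224489796
A,0.34615384615384615,0.34615384615384615,0.0,0.0,0.3076923076923077
G,0.2962962962962963,0.1111111111111111,0.037037037037037035,0.0,0.5555555555555556
V,0.20833333333333334,0.5625,0.020833333333333332,0.0,0.20833333333333334
E,0.5,0.225,0.025,0.0,0.25'''
details = pd.read_csv(StringIO(details_csv), index_col=0)

# Overall share of each letter (restricted to the letters with a breakdown).
totals = pd.Series({'L': 0.10128343899798979, 'A': 0.04587392402453482,
                    'G': 0.05204199096266515, 'V': 0.08343212549181313,
                    'E': 0.07848392694534645}, name='total')


def segment_labels(shares, min_share=0.05):
    """Format within-letter shares as percent strings.

    Hypothetical helper: segments below min_share get an empty label,
    because the text would not fit inside such a thin segment.
    """
    return [f'{s:.0%}' if s >= min_share else '' for s in shares]


# Segment heights: within-letter share times the letter's overall share.
heights = details.multiply(totals, axis='index').loc[details.index]

ax = heights.plot.bar(stacked=True, rot=0, cmap='Set3')

# pandas draws one BarContainer per column, in column order, so each
# container can be paired with the matching column of original shares.
for container, column in zip(ax.containers, details.columns):
    ax.bar_label(container, labels=segment_labels(details[column]),
                 label_type='center', fontsize=8)

ax.set_xlabel('')
ax.set_ylabel('share of all letters')
plt.tight_layout()
```

Call `plt.show()` afterwards to display the figure. The bar heights still show each letter's overall frequency, while the labels report each category's share within that letter, so the two scales stay separate.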

CodePudding user response:

Same idea with enter image description here
