I have been struggling to create a horizontal bar chart using Python, pandas, and matplotlib.
Ideally I'd like to have percentages on the chart as well.
I merged 2 datasets together to get this:
# dataframe
import pandas as pd

df = pd.DataFrame({'Moving Violation': [103281, 75376, 66957, 73071, 244090],
                   'Other Violations': [54165, 75619, 48567, 33587, 127639]},
                  index=['asian/pacific islander', 'black', 'hispanic', 'other', 'white'])
                        Moving Violation  Other Violations
asian/pacific islander            103281             54165
black                              75376             75619
hispanic                           66957             48567
other                              73071             33587
white                             244090            127639
I am now looking to create a stacked bar chart that looks something like this:
I am struggling to figure this out. I have tried plt.barh(), but it has not worked very well.
Answer:
You can do it with a temporary DataFrame that holds the row-wise percentages of your data, as in the example below. The DataFrame I created has different data but the same structure as yours.
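import pandas as pd
import matplotlib.pyplot as plt
# df similar to what you have
df = pd.DataFrame(index=["a", "b", "c"], data={"aa": [1, 2, 3], "bb": [5, 6, 7]})
# percentages: divide each row by its row total, then scale to 100
df_perc = df.apply(lambda x: x / x.sum(), axis=1) * 100
# plot on an explicitly created Axes
fig, ax = plt.subplots()
df_perc.plot(ax=ax, kind="barh", stacked=True)
plt.show()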
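Since the question also asks for percentages on the chart, here is a minimal sketch of one way to label each segment, using the data from the question. It assumes matplotlib 3.4 or newer (for Axes.bar_label); the figure size, axis label text, and legend placement are illustrative choices, not something from the original post.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Data from the question
df = pd.DataFrame(
    {"Moving Violation": [103281, 75376, 66957, 73071, 244090],
     "Other Violations": [54165, 75619, 48567, 33587, 127639]},
    index=["asian/pacific islander", "black", "hispanic", "other", "white"],
)

# Row-wise percentages: each row sums to 100
df_perc = df.div(df.sum(axis=1), axis=0) * 100

fig, ax = plt.subplots(figsize=(8, 4))
df_perc.plot(ax=ax, kind="barh", stacked=True)

# pandas adds one bar container per column; label every segment at its center.
# Axes.bar_label is available from matplotlib 3.4 onward.
for container in ax.containers:
    ax.bar_label(container, fmt="%.1f%%", label_type="center")

ax.set_xlabel("Percent of violations")  # illustrative label
ax.legend(loc="upper left", bbox_to_anchor=(1.0, 1.0))  # keep legend off the bars
fig.tight_layout()
```

Call plt.show() (or fig.savefig with a file name of your choice) afterwards to display or save the chart.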
DataFrame Views
df
                        Moving Violation  Other Violations
asian/pacific islander            103281             54165
black                              75376             75619
hispanic                           66957             48567
other                              73071             33587
white                             244090            127639
totals
asian/pacific islander    157446
black                     150995
hispanic                  115524
other                     106658
white                     371729
dtype: int64
percent
                        Moving Violation  Other Violations
asian/pacific islander             65.60             34.40
black                              49.92             50.08
hispanic                           57.96             42.04
other                              68.51             31.49
white                              65.66             34.34
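The totals and percent views above can be reproduced roughly as follows. This is a sketch that assumes the df from the question; the variable names totals and percent are chosen only to match the headings above, and rounding to two decimals is inferred from the displayed values.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame(
    {"Moving Violation": [103281, 75376, 66957, 73071, 244090],
     "Other Violations": [54165, 75619, 48567, 33587, 127639]},
    index=["asian/pacific islander", "black", "hispanic", "other", "white"],
)

# Row totals: sum across the two violation columns
totals = df.sum(axis=1)

# Share of each column within its row, as a percentage rounded to 2 decimals
percent = df.div(totals, axis=0).mul(100).round(2)

print(totals)
print(percent)
```

Passing axis=0 to div aligns totals with the row index, so each row is divided by its own total rather than by a column total.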