Home > Back-end >  How to create a stacked barchart for a large dataset in Python?
How to create a stacked barchart for a large dataset in Python?

Time:01-05

For research purposes at my university, I need to create a stacked bar chart for speech data. I would like to represent the hours of speech on the y-axis and the frequency on the x-axis. The speech comes from different components, hence the stacked part of the chart. The data resides in a Pandas dataframe, which has a lot of columns, but the important ones are "component", "hours" and "ps_med_frequency" which are used in the graph.

A simplified view of the DF (it has 6.2k rows and 120 columns, a-k components):

component filename ps_med_freq (rounded to integer) hours (length) ...
a fn0001_ps 230 0.23
b fn0002_ps 340 0.12
c fn003_ps 278 0.09

I have already tried this with matplotlib, seaborn or just the plot method from the Pandas dataframe itself. None seem to work properly.

A snippet of seaborn code I have tried:

sns.barplot(data=meta_dataframe, x='ps_med_freq', y='hours', hue='component', dodge=False)

And basically all variations of this as well.

Below you can see one of the most "viable" results I've had so far: sns.histplot using weights

  • Related