Histogram of grouped data outputting equal frequencies

Time:04-12

I have the following pandas dataframe:

df = pd.DataFrame({'Age Group': ['0-9', '10-24', '25-44', '45-64', '65+'], 'count': [66, 93, 143, 133, 122]})


I want to create a histogram showing the distribution of age groups using matplotlib:

plt.hist(male['count'], density=True, bins=30)  # density=False would make counts
plt.ylabel('Frequency')
plt.xlabel('Age Group');


Why are all the frequency values here equal? I can't figure out how to plot this correctly. Any help would be appreciated!

CodePudding user response:

This is a common misconception. A histogram (plt.hist) expects raw, non-aggregated data: it bins the individual observations and counts how many fall into each bin. Your dataset is already aggregated, with just one row per group.

Since you already have the aggregates, you need a bar plot.

plt.bar(df['Age Group'], df['count'])

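For reference, here is a minimal self-contained sketch of the bar plot approach. It assumes the data is held in a pandas DataFrame named df and that the last age group label is '65+'. The Agg backend line is only there so the script runs without a display and is not needed in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Already-aggregated data: one row per age group.
# The '65+' label is an assumption about the original data.
df = pd.DataFrame({
    "Age Group": ["0-9", "10-24", "25-44", "45-64", "65+"],
    "count": [66, 93, 143, 133, 122],
})

fig, ax = plt.subplots()
# One bar per group; the bar height is the stored count.
bars = ax.bar(df["Age Group"], df["count"])
ax.set_xlabel("Age Group")
ax.set_ylabel("Count")
# Call plt.show() or fig.savefig(...) to view the result.
```

The bar heights come straight from the count column, so nothing is binned or recounted.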

CodePudding user response:

plt.hist(male['count'], ...) plots the histogram of male['count'], which is the array/list [66, 93, 143, 133, 122]. Each number appears there exactly once and, with bins=30, lands in a bin of its own, so every non-empty bar has the same height and the histogram looks like a uniform distribution.
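You can check this without plotting anything. The sketch below calls numpy.histogram, which plt.hist uses to compute its bins, on the same five values with the same bins=30 and density=True settings. The variable names are just for illustration.

```python
import numpy as np

# The values that were passed to plt.hist: the counts themselves,
# not the ages of individual people.
counts = np.array([66, 93, 143, 133, 122])

# Same binning settings as in the question.
hist, edges = np.histogram(counts, bins=30, density=True)

# Keep only the bins that actually contain a value.
nonzero = hist[hist > 0]
print(len(nonzero))                      # number of non-empty bins
print(np.allclose(nonzero, nonzero[0]))  # are they all the same height?
```

Five non-empty bins of identical height is exactly the flat picture from the question.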
