I have the following pandas dataframe:
import pandas as pd
df = pd.DataFrame({'Age Group': ['0-9', '10-24', '25-44', '45-64', '65+'], 'count': [66, 93, 143, 133, 122]})
I want to create a histogram showing the distribution of age groups using matplotlib:
plt.hist(male['count'], density=True, bins=30) # density=False would make counts
plt.ylabel('Frequency')
plt.xlabel('Age Group');
Why are all the frequency values here equal? I can't seem to figure out how to plot this correctly. Any help would be appreciated!
CodePudding user response:
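This is a common misconception. A histogram (hist) needs a raw, non-aggregated dataset: it takes individual observations and does the binning and counting itself. Your dataset is already aggregated (there is effectively only one row per group), so hist just sees five unrelated numbers.
Since you already have the counts per group, you need a bar plot instead: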
plt.bar(df['Age Group'], df['count'])
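For reference, here is a self-contained sketch of the bar plot approach. It assumes the last age group label is '65+' and that the y-axis should show raw counts; the output filename age_groups.png and the headless Agg backend are only illustrative choices so the script runs anywhere.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Already-aggregated data: one row per age group
df = pd.DataFrame({
    "Age Group": ["0-9", "10-24", "25-44", "45-64", "65+"],
    "count": [66, 93, 143, 133, 122],
})

fig, ax = plt.subplots()
# One bar per group, with the bar height taken directly from the count column
bars = ax.bar(df["Age Group"], df["count"])
ax.set_xlabel("Age Group")
ax.set_ylabel("Count")
fig.savefig("age_groups.png")  # illustrative filename
```

If you want proportions rather than counts, you could plot df["count"] / df["count"].sum() instead of df["count"].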
CodePudding user response:
plt.hist(male['count'], ...) plots the histogram of male['count'], which is the array/list [66, 93, 143, 133, 122]. Each number appears there exactly once, so the histogram shows a uniform distribution.
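You can check this without plotting. The sketch below uses np.histogram, which is what plt.hist relies on for its binning, with the same bins=30 and density=True as in the question. The variable names are just for illustration.

```python
import numpy as np

# The aggregated values that were passed to plt.hist
counts = np.array([66, 93, 143, 133, 122])

# Same binning parameters as in the question
heights, edges = np.histogram(counts, bins=30, density=True)

# Keep only the bins that actually contain a value
nonzero = heights[heights > 0]
print(len(nonzero))                      # number of non-empty bins
print(np.allclose(nonzero, nonzero[0]))  # whether all non-empty bins share one height
```

Each of the five values lands in its own bin, so every non-empty bin contains exactly one value and gets the same height.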