I want to calculate the percentage of ids in each category and generate a bar plot.
Here is an example of my data:
id LGA Status
1 Banyule Referred
2 Hepburn Referred
3 Kingston Not Referred
4 Darebin Not Referred
5 Darebin Managed Externally
6 Darebin Managed Externally
7 Mansfield Managed Externally
8 Casey Referred
9 Mitchell Referred
10 Mitchell Not Referred
11 Moreland Referred
12 Whittlesea Not Referred
13 Glen Eira Not Referred
14 Dandenong Referred
15 Hume Not Referred
16 Hume Managed Externally
17 Campaspe Not Referred
18 Melbourne Not Referred
19 Melbourne Not Referred
I've used value_counts() to calculate the counts for the "LGA" and "Status" columns and generate a bar plot for each.
Sample code:
df['Status'].value_counts().plot(kind='bar')
df['LGA'].value_counts().plot(kind='bar')
I am not sure how to do this elegantly if I want percentages instead of counts for the same columns, again with a separate bar plot for each.
Expected output (I derived the figures below in Excel):
Status % of Grand Total
Not Referred 58.42%
Referred 23.68%
Managed Externally 17.89%
Grand Total 100.00%
Expected Bar Chart:
Any help would be appreciated.
CodePudding user response:
I believe this is what you are looking for:
import matplotlib.pyplot as plt

# Share of each Status as a percentage of all rows, largest first
temp_df = df['Status'].value_counts(normalize=True) * 100
ax = temp_df.plot(kind='bar')
ax.bar_label(ax.containers[0], fmt='%.2f%%')  # bar_label needs matplotlib 3.4+
plt.show()
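Since the question asks for the same thing on both "Status" and "LGA", here is a minimal sketch that wraps the idea in two small helpers. The names percent_share and plot_percent are made up for illustration, and the DataFrame only contains the first ten rows of the sample data from the question, so the percentages will not match the Excel figures computed on the full dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# First ten rows of the sample data from the question (illustrative subset only).
df = pd.DataFrame({
    "id": range(1, 11),
    "LGA": ["Banyule", "Hepburn", "Kingston", "Darebin", "Darebin",
            "Darebin", "Mansfield", "Casey", "Mitchell", "Mitchell"],
    "Status": ["Referred", "Referred", "Not Referred", "Not Referred",
               "Managed Externally", "Managed Externally", "Managed Externally",
               "Referred", "Referred", "Not Referred"],
})


def percent_share(frame, column):
    """Return each value's share of rows in `column`, as a percentage."""
    # normalize=True yields fractions that sum to 1; scale them to percent.
    return frame[column].value_counts(normalize=True) * 100


def plot_percent(frame, column):
    """Draw one bar per value in `column`, labelled with its percentage."""
    pct = percent_share(frame, column)
    _, ax = plt.subplots()  # a fresh figure, so each column gets its own plot
    pct.plot(kind="bar", ax=ax)
    ax.set_ylabel("% of total")
    ax.bar_label(ax.containers[0], fmt="%.2f%%")  # requires matplotlib 3.4+
    return ax


status_pct = percent_share(df, "Status")
lga_pct = percent_share(df, "LGA")
print(status_pct.round(2))
```

Calling plot_percent(df, 'Status') and plot_percent(df, 'LGA') followed by plt.show() should then give one chart per column. Each call creates its own figure, so the two plots do not end up on the same axes.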