I have a data frame that looks something like this, and I am trying to make two grouped bar charts. One is grouped by sex and shows the disease counts for males and females:
Age | Sex | disease |
---|---|---|
23 | m | 0 |
43 | m | 1 |
32 | f | 0 |
51 | f | 1 |
29 | m | 1 |
This is what I've tried, but it shows every data point instead of the total counts.
heart_failure_df.set_index(['Sex']).plot(y='disease', kind="bar")
Ideally, this is what I want:
CodePudding user response:
Code:
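import pandas as pd
import matplotlib.pyplot as plt

# Sample data matching the table in the question
d = {'Age': [23, 43, 32, 51, 29],
     'Sex': ['m', 'm', 'f', 'f', 'm'],
     'disease': [0, 1, 0, 1, 1]}
df = pd.DataFrame(data=d)

# One row per sex, one column per disease value, cells hold row counts.
# 'Age' is counted because it is never null here; reusing 'disease' for both
# columns= and values= can raise an error in some pandas versions.
df_pivot = pd.pivot_table(
    df,
    index="Sex",
    columns="disease",
    values="Age",
    aggfunc='count')

fig, ax = plt.subplots()
df_pivot.plot(kind="bar", ax=ax)
ax.legend(["Normal", "Heart Disease"])
ax.tick_params(axis='x', rotation=0)
ax.set_ylabel("count")
plt.show()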
- The pivot puts the data frame in the format that DataFrame.plot() needs for a grouped bar chart and counts the occurrences of each disease value in one step.
- The legend needs to be set manually because the disease values in your data are 0 and 1.
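As a possible alternative to the pivot table, the same count table can be built with pd.crosstab, which counts rows directly and needs no values column. This is only a sketch on the sample data from the question; the legend labels "Normal" and "Heart Disease" are assumed meanings of 0 and 1, taken from the legend call above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data copied from the question.
df = pd.DataFrame({
    "Age": [23, 43, 32, 51, 29],
    "Sex": ["m", "m", "f", "f", "m"],
    "disease": [0, 1, 0, 1, 1],
})

# Rows are Sex, columns are the disease values, cells are row counts.
counts = pd.crosstab(df["Sex"], df["disease"])

# Rename the 0/1 column labels so the legend is readable without
# setting it by hand. The label text is an assumption.
counts = counts.rename(columns={0: "Normal", 1: "Heart Disease"})

# One group of bars per sex, one bar per disease value.
fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax, rot=0)
ax.set_ylabel("count")
```

Call plt.show() afterwards to display the figure. Renaming the columns before plotting keeps the labels attached to the data rather than to the order of the legend entries.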
CodePudding user response:
First, aggregate your data:
s = df.groupby(['Sex', 'disease']).size()
Then slice it by sex:
Males only: s['m']
Females only: s['f']
Each slice is a Series whose index (the disease values) gives the x axis and whose values (the counts) give the y axis. After that, a simple bar plot will do the trick.
Just take a look at the documentation (https://matplotlib.org/stable/gallery/lines_bars_and_markers/barchart.html)
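Following the approach in the linked matplotlib example, the grouped counts could be drawn roughly like this. It is a sketch on the sample data from the question, grouping by both Sex and disease so that each slice is a Series; the bar width and the legend labels are arbitrary choices, not something the question specifies.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data copied from the question.
df = pd.DataFrame({
    "Age": [23, 43, 32, 51, 29],
    "Sex": ["m", "m", "f", "f", "m"],
    "disease": [0, 1, 0, 1, 1],
})

# Count rows per (Sex, disease) pair; the result has a two-level index,
# so s["m"] and s["f"] are Series indexed by the disease value.
s = df.groupby(["Sex", "disease"]).size()

# Spread the disease level into columns; missing pairs become 0.
table = s.unstack(fill_value=0)

x = np.arange(len(table.index))  # one slot per sex
width = 0.35                     # bar width, chosen arbitrarily

fig, ax = plt.subplots()
# Label text is an assumed reading of the 0/1 values.
for i, (value, label) in enumerate([(0, "Normal"), (1, "Heart Disease")]):
    # Shift each series left or right of the slot centre.
    ax.bar(x + (i - 0.5) * width, table[value].to_numpy(), width, label=label)

ax.set_xticks(x)
ax.set_xticklabels(table.index)
ax.set_ylabel("count")
ax.legend()
```

Call plt.show() afterwards to display the figure.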