I am having trouble turning this table (generated from a data frame using the groupby function) into the graph I need. The table holds, for each gender, the average answer to each of 10 questions that users answered on a scale of 1 to 5.
Gender | Q1 | Q2 | ... |
---|---|---|---|
0 (Male) | 3.7 | 1.3 | ... |
1 (Female) | 2.8 | 3.1 | ... |
2 (Other) | 4.2 | 4.7 | ... |
It looks something like this table, except that it carries on for all 10 questions. Currently, when I call .plot on this table, it produces the wrong graph.
Instead of that graph, I want each question to have 3 bars, one per gender, with the 10 questions along the x-axis. Any help on how to achieve this would be greatly appreciated.
CodePudding user response:
It is unclear how your data is organized or how you created the table.
Suppose the input data has 3 columns: 'gender', 'question' and 'mark'. Here is some random test data in that form:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = pd.DataFrame({'gender': np.random.choice(['male', 'female', 'other'], 1000),
                   'question': pd.Categorical.from_codes(codes=np.random.randint(0, 10, 1000),
                                                         categories=[f'Q{i}' for i in range(1, 11)]),
                   'mark': np.random.randint(1, 6, 1000)})
df['gender'] = pd.Categorical(df['gender'], categories=['male', 'female', 'other']) # fix an order
You can then create a table via df.pivot_table(index='gender', columns='question', values='mark')
and plot it as bars, resembling the image of the question:
df.pivot_table(index='gender', columns='question', values='mark').plot.bar(rot=0, width=0.8)
Now, creating the pivot_table the other way around would place the questions as the x-positions and the gender as coloring:
df.pivot_table(index='question', columns='gender', values='mark').plot.bar(rot=0, width=0.8)
If your table is already created, you can transpose it before plotting, also resulting in the questions for the x-axis:
df_table = df.pivot_table(index='gender', columns='question', values='mark')
df_table.T.plot.bar(rot=0, width=0.8)
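Since the question mentions that the table came from a groupby, the full path from raw answers to the transposed plot might look roughly like the sketch below. It assumes a layout the question does not confirm: one row per respondent, a numeric 'Gender' code column and one column per question. The names raw and table and the random data are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw survey data: one row per respondent, answers on a 1-5 scale.
rng = np.random.default_rng(0)
n = 300
raw = pd.DataFrame({'Gender': rng.integers(0, 3, n),
                    **{f'Q{i}': rng.integers(1, 6, n) for i in range(1, 11)}})

# Average answer per gender: one row per gender, one column per question.
table = raw.groupby('Gender').mean()

# Give the gender codes readable names (mapping taken from the question's table).
table.index = table.index.map({0: 'Male', 1: 'Female', 2: 'Other'})

# Transpose so the questions become the x-positions and the genders the bars.
ax = table.T.plot.bar(rot=0, width=0.8)
ax.set_xlabel('question')
ax.set_ylabel('average answer')
```

Calling plt.show() afterwards (after import matplotlib.pyplot as plt) displays the figure when running as a script.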
An alternative is to use seaborn, which creates the bar plot directly from the original dataframe and, by default, shows error bars. To suppress them, use ci=None (or errorbar=None in seaborn 0.12 and later, where ci is deprecated). You just tell seaborn what you want to see on the x-axis, what on the y-axis, and which column to use for the colored separation.
import seaborn as sns
ax = sns.barplot(data=df, x='question', y='mark', hue='gender')
ax.legend(bbox_to_anchor=[1.01, 1.01], loc='upper left')
plt.tight_layout()
plt.show()
Now, if your dataframe looks like:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = pd.DataFrame({'Gender': np.arange(3),
                   **{f'Q{i}': np.random.randint(15, 46, 3) / 10 for i in range(1, 11)}})
For example, printing df could give:
Gender Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
0 0 3.9 2.7 2.2 2.6 2.2 2.8 2.2 3.2 2.9 2.9
1 1 4.2 2.0 4.4 2.4 3.6 2.1 1.6 4.0 2.7 4.2
2 2 3.7 3.9 4.2 2.9 3.2 4.4 4.2 2.1 2.9 3.6
You can make the 'Gender' column categorical with proper names, use it as index, transpose the dataframe and plot:
df['Gender'] = pd.Categorical.from_codes(df['Gender'], categories=['Male', 'Female', 'Other'])
df.set_index('Gender').T.plot.bar(rot=0, width=0.8)
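If you would rather use seaborn with such an already-averaged wide table, one option is to reshape it to long form first with melt. This is only a sketch: the names wide and long_df and the column labels 'question' and 'average' are choices made here, not something taken from your data.

```python
import numpy as np
import pandas as pd

# Hypothetical table of averages in wide form: one row per gender, one column per question.
rng = np.random.default_rng(1)
wide = pd.DataFrame({'Gender': ['Male', 'Female', 'Other'],
                     **{f'Q{i}': rng.integers(15, 46, 3) / 10 for i in range(1, 11)}})

# Long form: one row per (gender, question) pair.
long_df = wide.melt(id_vars='Gender', var_name='question', value_name='average')
print(long_df.head())
```

The result can then be passed to seaborn as sns.barplot(data=long_df, x='question', y='average', hue='Gender'). Because each gender and question pair has just one value, there is nothing for seaborn to compute error bars from.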