I have a data set (my_data) that looks something like this:
Gender Time Money Score
Female 23 14 26.74
Male 12 98 56.76
Male 11 34 53.98
Female 18 58 25.98
etc.
I want to draw box plots of Score grouped by Gender, so that the two boxes (Male and Female) appear side by side on the same axes.
My code so far:
Males = [my_data.loc[my_data['Gender']=='Male', 'Score']]
Females = [my_data.loc[my_data['Gender']=='Female', 'Score']]
Score = [Males, Females]
fig, ax = plt.subplots()
ax.boxplot(Score)
plt.show()
However, this raises an error:
ValueError: X must have 2 or fewer dimensions
I tried converting Males and Females to arrays, thinking that maybe matplotlib did not like them being lists:
Males = np.array([my_data.loc[my_data['Gender']=='Male', 'Score']])
Females = np.array([my_data.loc[my_data['Gender']=='Female', 'Score']])
But that still didn't work. Besides, the boxplot documentation says it accepts a list of sequences, so I shouldn't need to do that anyway.
I also tried a different way of making a boxplot like this:
fig = plt.figure(figsize =(10, 7))
ax = fig.add_axes([Males, Females])
bp = ax.boxplot(Score)
plt.show()
And it gave me this error:
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
<Figure size 720x504 with 0 Axes>
What's going on?
Answer:
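When you extracted your source data, you wrapped each selection in square brackets unnecessarily. Males and Females each became a one-element list holding a Series, so Score = [Males, Females] is nested three levels deep, and boxplot rejects it with "X must have 2 or fewer dimensions". Passing the same bracketed expression to np.array does not help, because the result is still a 2-D array of shape (1, n) rather than a 1-D sequence. In your second attempt, fig.add_axes expects the position of the axes as [left, bottom, width, height] in figure coordinates, not your data, so the call fails before any axes are created (hence "0 Axes").
Create them without the brackets instead: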
Males = my_data.loc[my_data['Gender']=='Male', 'Score']
Females = my_data.loc[my_data['Gender']=='Female', 'Score']
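To see the difference the brackets make, here is a minimal sketch that compares the nesting depth of the two inputs. The DataFrame is an assumption built from the four sample rows shown in the question, and the lowercase names (males, females, wrapped, flat) are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumption: a tiny stand-in for my_data, using the four sample rows from the question.
my_data = pd.DataFrame({
    "Gender": ["Female", "Male", "Male", "Female"],
    "Time": [23, 12, 11, 18],
    "Money": [14, 98, 34, 58],
    "Score": [26.74, 56.76, 53.98, 25.98],
})

males = my_data.loc[my_data["Gender"] == "Male", "Score"]
females = my_data.loc[my_data["Gender"] == "Female", "Score"]

# What the question built: each Series wrapped in its own list.
wrapped = [[males], [females]]
# What boxplot wants: a flat list of 1-D sequences.
flat = [males, females]

wrapped_ndim = np.ndim(wrapped)  # 3 levels: groups x extra list x values
flat_ndim = np.ndim(flat)        # 2 levels: groups x values

print(wrapped_ndim, flat_ndim)
```

With this sample both groups happen to have the same length, which is what lets NumPy report a regular shape here; boxplot itself does not need equal-length groups.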
Then, to draw the box plot, you can run, for example:
fig, ax = plt.subplots(1, 1)
ax.boxplot([Males, Females])
ax.set_xticklabels(['Males', 'Females'])
plt.show()
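For reference, the pieces above can be put together into one self-contained script. This is only a sketch: the DataFrame is an assumption built from the four sample rows in the question, the Agg backend and the output file name score_boxplot.png are choices made so the script runs without a display, and the y-axis label is an addition not present in the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Assumption: a tiny stand-in for my_data, using the four sample rows from the question.
my_data = pd.DataFrame({
    "Gender": ["Female", "Male", "Male", "Female"],
    "Time": [23, 12, 11, 18],
    "Money": [14, 98, 34, 58],
    "Score": [26.74, 56.76, 53.98, 25.98],
})

# Select each group as a plain Series (no surrounding square brackets).
males = my_data.loc[my_data["Gender"] == "Male", "Score"]
females = my_data.loc[my_data["Gender"] == "Female", "Score"]

fig, ax = plt.subplots()
# One box per element of the list, drawn at x positions 1 and 2.
result = ax.boxplot([males, females])
ax.set_xticklabels(["Males", "Females"])
ax.set_ylabel("Score")

# Hypothetical output file name; use plt.show() instead with an interactive backend.
fig.savefig("score_boxplot.png")
```

To view the figure in a window or notebook, drop the matplotlib.use("Agg") line and replace the savefig call with plt.show().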