I am trying to create a boxplot where the x-axis is grouped by two columns of a dataframe and the y-axis shows the values of a third column.
Consider this example dataframe:
  Lvl1  Lvl2  value
0    A     1      1
1    A     2      2
2    A     1      3
3    B     2      4
4    B     1      5
5    B     2      6
Now I want one boxplot for each group defined by Lvl1 and Lvl2. For example, for the group (Lvl1 = A, Lvl2 = 1), the boxplot would be computed from the values {1, 3}.
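To make the intended grouping concrete, the values that fall into each (Lvl1, Lvl2) pair can be listed with a plain groupby. This is only a sanity check on the example data, not the plot itself; the name `per_group` is just illustrative.

```python
import pandas as pd

dataset = pd.DataFrame(
    {
        'Lvl1': ['A', 'A', 'A', 'B', 'B', 'B'],
        'Lvl2': [1, 2, 1, 2, 1, 2],
        'value': [1, 2, 3, 4, 5, 6],
    }
)

# Collect the 'value' entries belonging to each (Lvl1, Lvl2) pair.
per_group = dataset.groupby(['Lvl1', 'Lvl2'])['value'].apply(list)
print(per_group)
```

Each of the four resulting groups is what one box in the plot should summarize.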
I know I can create a new column, say Lvl0, that combines Lvl1 and Lvl2, but is there a way to create the boxplot without that extra step?
With the following code:
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.DataFrame(
{'Lvl1': ['A', 'A', 'A', 'B', 'B', 'B'], 'Lvl2': [1, 2, 1, 2, 1, 2], 'value': [1, 2, 3, 4, 5, 6]})
grouped = dataset.groupby(['Lvl1', 'Lvl2'])
grouped.boxplot()
plt.show()
I get an error:
KeyError: "None of [Index(['A', 1], dtype='object')] are in the [index]"
Thank you in advance!
CodePudding user response:
Try seaborn for an easier solution. I think this was already answered here:
CodePudding user response:
You can do this with seaborn. The following code works for me on your data:
import pandas as pd
import seaborn as sns
dataset = pd.DataFrame(
    {
        'Lvl1': ['A', 'A', 'A', 'B', 'B', 'B'],
        'Lvl2': [1, 2, 1, 2, 1, 2],
        'value': [1, 2, 3, 4, 5, 6],
    }
)
# One box per Lvl1 value on the x-axis, split by Lvl2 via hue.
ax = sns.boxplot(x='Lvl1', y='value', hue='Lvl2', data=dataset)
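If you would rather stay with pandas and matplotlib, a possible alternative is to skip the explicit groupby and pass both columns to the `by` argument of `DataFrame.boxplot`. This is a minimal sketch on the example data from the question; the names `fig` and `ax` are only illustrative, and the tick labels and title are whatever pandas generates by default.

```python
import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.DataFrame(
    {
        'Lvl1': ['A', 'A', 'A', 'B', 'B', 'B'],
        'Lvl2': [1, 2, 1, 2, 1, 2],
        'value': [1, 2, 3, 4, 5, 6],
    }
)

fig, ax = plt.subplots()
# Passing a list to `by` draws one box per (Lvl1, Lvl2) combination,
# so no helper column combining the two levels is needed.
dataset.boxplot(column='value', by=['Lvl1', 'Lvl2'], ax=ax)
plt.show()
```

This puts all four groups on a single axis, whereas the seaborn version groups by Lvl1 on the x-axis and distinguishes Lvl2 by color.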