How to plot my data using proportions and violin plots?

Time:04-19

Let's say I have people chew a type of gum while reading a question, and then answer a test question. Sometimes they would chew orange gum while reading and answering a question. Sometimes they would chew peppermint. Not everyone chewed and answered all of the questions.

Let's say I have my data laid out like this:

ID   Gum Type     Test (1 = correct, 0 = incorrect)
1    Orange       1
1    Orange       0
1    Peppermint   0
1    Peppermint   1
2    Orange       0
2    Peppermint   1

I want to create a violin plot with Gum Type on the x-axis and the proportion correct on the test on the y-axis. Each participant should appear as a single data point per gum type. For example, participant 1 would be one point on the "Orange" violin, in the middle (50% of the Orange questions correct), and one point on the "Peppermint" violin.

CodePudding user response:

Use:

import re

import pandas as pd
import seaborn as sns

data = '''ID    Gum Type    Test (1= correct, 0=incorrect)
1   Orange  1
1   Orange  0
1   Peppermint  0
1   Peppermint  1
2   Orange  0
2   Peppermint  1'''
# Split each line on runs of two or more spaces, since "Gum Type" itself contains a single space
data = [re.split(r'\s{2,}', x.strip()) for x in data.split('\n')]

df = pd.DataFrame(data[1:], columns=data[0])
score = 'Test (1= correct, 0=incorrect)'
df[score] = df[score].astype(int)
# The mean of the 0/1 scores per participant and gum type is the proportion correct
df1 = df.groupby(['ID', 'Gum Type'])[score].mean().to_frame().reset_index()
ax = sns.violinplot(x="Gum Type", y=score, data=df1)

Output:

[Figure: violin plot of the proportion correct for each Gum Type]
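
If each participant should also be visible as an individual point on the violin, one possible approach is to overlay a strip plot on the same axes. The following is only a sketch under a few assumptions: the column names "Correct" and "Proportion correct" are hypothetical shorthand chosen for this example, and the data is the same toy table from the question.

```python
import pandas as pd
import seaborn as sns

# Same toy data as in the question; 1 = correct, 0 = incorrect.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2],
    "Gum Type": ["Orange", "Orange", "Peppermint", "Peppermint", "Orange", "Peppermint"],
    "Correct": [1, 0, 0, 1, 0, 1],  # hypothetical shorter column name
})

# One row per participant and gum type: the mean of the 0/1 scores
# is that participant's proportion correct for that gum type.
props = (
    df.groupby(["ID", "Gum Type"])["Correct"]
      .mean()
      .reset_index(name="Proportion correct")
)

# Draw the violins without the default inner box (inner=None) and without
# extending the density past the observed range (cut=0), then overlay a
# strip plot so each participant shows up as one point per gum type.
ax = sns.violinplot(data=props, x="Gum Type", y="Proportion correct", inner=None, cut=0)
sns.stripplot(data=props, x="Gum Type", y="Proportion correct", color="black", ax=ax)
ax.set_ylim(-0.05, 1.05)
```

With this data, participant 1 lands at 0.5 on both violins, while participant 2 lands at 0 for Orange and 1 for Peppermint. Calling matplotlib.pyplot.show() afterwards displays the figure when running as a script.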