I'm trying to understand how seaborn's pointplot function (Link to pointplot doc) plots error bars.
Setting the 'errorbar' argument to 'sd' should plot the standard deviation along with the mean. But when I calculate the standard deviation manually, I get a different value.
I used the example provided in the documentation:
import seaborn as sns
df = sns.load_dataset("penguins")
ax = sns.pointplot(data=df, x="island", y="body_mass_g", errorbar="sd")
data = ax.lines[1].get_ydata()
print(data[1] - data[0]) # prints 248.57843137254895
sd = df[df['island'] == 'Torgersen']['body_mass_g'].std()
print(sd) # prints 445.10794020256765
I expected both printed values to be the same, since both data[1] - data[0] and sd should be equal to the standard deviation of the variable 'body_mass_g' for the category 'Torgersen'. The other standard deviations drawn by sns.pointplot are also not as expected.
I must be missing something obvious here, but for the life of me I can't figure it out. I'd appreciate any help. I tested the code locally and in Google Colab, with the same results.
CodePudding user response:
My PC had an outdated version of seaborn (0.11.2), in which the 'errorbar' argument was still named 'ci'. Using the correct argument for the installed version resolves the problem. Strangely, Google Colab also uses version 0.11.2, contrary to their claim that they auto-update their packages.
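To make the version dependence concrete, here is a minimal sketch of picking the keyword from the installed version string. The helper name sd_errorbar_kwargs is hypothetical (it is not part of seaborn), and the sketch assumes the rename from 'ci' to 'errorbar' happened in seaborn 0.12.

```python
def sd_errorbar_kwargs(seaborn_version: str) -> dict:
    """Return the keyword argument that requests standard-deviation error bars.

    Hypothetical helper for illustration; not part of seaborn.
    Assumes 'errorbar' is available from seaborn 0.12 onwards.
    """
    # Compare only major.minor, so strings like "0.12.0.dev0" also parse.
    major, minor = (int(part) for part in seaborn_version.split(".")[:2])
    if (major, minor) >= (0, 12):
        return {"errorbar": "sd"}  # seaborn 0.12 and later
    return {"ci": "sd"}  # older releases such as 0.11.2


print(sd_errorbar_kwargs("0.11.2"))  # {'ci': 'sd'}
print(sd_errorbar_kwargs("0.12.2"))  # {'errorbar': 'sd'}
```

With seaborn imported as sns, the call from the question could then be written as sns.pointplot(data=df, x="island", y="body_mass_g", **sd_errorbar_kwargs(sns.__version__)). Also note that, as far as I understand, the bar drawn for 'sd' extends one standard deviation on each side of the mean, so data[1] - data[0] would be expected to come out at roughly twice sd rather than equal to it.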