seaborn log scale violin plot lower whisker issue

Time:05-09

I am using seaborn to create a violin plot of my dataset, which consists of five intervals containing 100 values each. The values vary a lot, ranging from 1 to 2622873. To make the graph readable I decided to use a logarithmic y-scale, but this causes problems with the violin plot: the violins' bottom whiskers never round off and instead continue towards negative infinity. This is not a problem when using a box plot (see the commented-out line). Note that the smallest values in my dataset are 1. Is there any way to round off the violin plot at the bottom, similar to how it looks in the box plot?

(Figures omitted: the violin plot and the box plot.)

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd


# col1 ... col5 hold the five intervals (100 values each); their definitions are not shown
data = {'interval1':col1,'interval2':col2,'interval3':col3,'interval4':col4,'interval5':col5}
df = pd.DataFrame(data)
sns.set_style("whitegrid")
plt.yscale("log")
plt.ylim(10**(0), 10**7)

plt.xlabel("x")
plt.ylabel("y")
sns.violinplot(data=df, palette="muted", scale="count", inner="quartile")

#sns.boxplot(data=df, palette="muted")

plt.show()

Answer:

Currently, violinplot does not compute the density estimate in log space when the axis is log scaled; it computes the density on a linear grid and then scales those values. The same is true for boxplot, but boxplots are based on quantiles, and those do not change when log transformed (though note the imbalance in outliers in the log-scaled version of the plot).
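
To see why a linear-grid density runs off the bottom of a log axis, here is a rough sketch of where such a grid would start. It assumes a Scott's-rule bandwidth and a grid extending `cut = 2` bandwidths past the data (violinplot's documented defaults); it is not seaborn's actual code, and the sample values are made up to resemble the range in the question.

```python
import numpy as np

# Hypothetical stand-in for one interval: 100 heavy-tailed values, all >= 1.
rng = np.random.default_rng(0)
values = np.clip(rng.lognormal(mean=6, sigma=3, size=100), 1, None)

n = values.size
cut = 2  # assumption: the grid extends `cut` bandwidths beyond the data extremes

# Scott's rule for 1-D data: bandwidth = n^(-1/5) * sample standard deviation.
bandwidth = n ** (-1 / 5) * values.std(ddof=1)
lower_edge = values.min() - cut * bandwidth

# The same calculation after a log10 transform.
log_values = np.log10(values)
log_bandwidth = n ** (-1 / 5) * log_values.std(ddof=1)
log_lower_edge = log_values.min() - cut * log_bandwidth

print("lower edge on linear grid:", lower_edge)      # negative: undefined on a log axis
print("lower edge in log10 space:", log_lower_edge)  # finite, so the violin closes
```

Because the largest values dominate the standard deviation, the linear grid starts well below zero, and a log axis has nowhere to draw that part of the violin.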

You'll need to log transform the data before giving it to either function.
