I have a dataframe with the following structure:
import pandas as pd
data = [
[12, [0.1, 0.2, 0.3, 0.4, 0.5]],
[14, [0.8, 0.7, 0.6, 0.4, 0.2]]
# .... and so on
]
df = pd.DataFrame(data, columns=['index', 'distribution'])
How can I build boxplot chart(s) where:
- each box-and-whisker shows the distribution (using the box/whiskers/outliers) of the 'distribution' column above for each index
- each box-and-whisker aggregates the distributions with the same index (e.g. if the 'index' value is the same, the 'distribution' values are merged)
CodePudding user response:
You can use pandas' .explode() to convert the pesky lists into a long-form dataframe, with one row per list element. Seaborn is by far the easiest way to create a matplotlib-style boxplot from a dataframe. Seaborn will automatically group values belonging to the same 'index'.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
data = [
[12, [0.1, 0.2, 0.3, 0.4, 0.5]],
[14, [0.8, 0.7, 0.6, 0.4, 0.2]]
# .... and so on
]
df = pd.DataFrame(data, columns=['index', 'distribution'])
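# explode() gives one row per list element; the column comes back as object dtype, so cast it to float
long_df = df.explode('distribution').astype({'distribution': float})
sns.boxplot(data=long_df, x='index', y='distribution', palette='magma')
plt.show()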
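To see why rows with the same 'index' end up in the same box, it can help to look at the exploded frame on its own. The following is a minimal sketch, not part of the original answer: the third row (a second entry for index 12) is made-up sample data added only to illustrate the merging, and the names long_df and counts are chosen here for illustration.

```python
import pandas as pd

# Hypothetical sample data: index 12 appears twice on purpose.
data = [
    [12, [0.1, 0.2, 0.3, 0.4, 0.5]],
    [14, [0.8, 0.7, 0.6, 0.4, 0.2]],
    [12, [0.9, 1.0]],
]
df = pd.DataFrame(data, columns=['index', 'distribution'])

# One row per list element; the 'index' value is repeated for each element.
# The exploded column has object dtype, so cast it to float.
long_df = df.explode('distribution').astype({'distribution': float})

# Number of values that would feed each box in the plot.
counts = long_df.groupby('index')['distribution'].count()
print(counts)
```

Both rows for index 12 contribute to a single group, so a plot with x='index' draws one box per distinct index value rather than one per original row.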