I have two pandas dataframes containing data for three different categories: 'a', 'b' and 'c'.
import pandas as pd
import numpy as np

n = 100
# Category labels: 'a' repeated n times, then 'b', then 'c'
ids = np.repeat(['a', 'b', 'c'], n)
df_a = pd.DataFrame({'id': ids, 'val': np.random.normal(0, 1, 3 * n)})
df_b = pd.DataFrame({'id': ids, 'val': np.random.normal(1, 1, 3 * n)})
I would like to illustrate how 'a', 'b' and 'c' differ between the two dataframes, and I want to use boxplots for that. That is, for each category ('a', 'b' and 'c') I want side-by-side boxplots, and they should all be in the same figure.
So: one figure containing 6 boxplots, 2 per category. What is the easiest way to achieve this?
CodePudding user response:
If I understand correctly, you want one row of subplots per category and one column per dataframe:
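import matplotlib.pyplot as plt

# 3 rows (one per category) x 2 columns (one per dataframe)
fig, axes = plt.subplots(3, 2)
for j, df in enumerate([df_a, df_b]):
    for i, cat in enumerate(sorted(df['id'].unique())):
        # Draw the boxplot for this category on its own axes
        df[df['id'] == cat].boxplot(column='val', by='id', ax=axes[i, j])
plt.tight_layout()
plt.show()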
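The grid above gives every box its own axes. If all six boxes should instead share one axes, a possible sketch is to stack the two frames and group by both the category and the frame of origin. The `source` column, the `combined` name, and the seeded random generator are assumptions made for this illustration, not part of the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Rebuild data shaped like the question's (seeded only for reproducibility).
rng = np.random.default_rng(0)
n = 100
ids = np.repeat(["a", "b", "c"], n)
df_a = pd.DataFrame({"id": ids, "val": rng.normal(0, 1, 3 * n)})
df_b = pd.DataFrame({"id": ids, "val": rng.normal(1, 1, 3 * n)})

# "source" is a hypothetical helper column recording which frame a row came from.
combined = pd.concat(
    [df_a.assign(source="df_a"), df_b.assign(source="df_b")],
    ignore_index=True,
)

fig, ax = plt.subplots(figsize=(8, 4))
# Grouping by two columns yields one box per (id, source) pair on a single axes.
combined.boxplot(column="val", by=["id", "source"], ax=ax)
fig.suptitle("")  # drop the automatic "Boxplot grouped by ..." title
ax.set_title("val by id and source")
ax.set_xlabel("")
```

Calling `plt.show()` afterwards displays the figure. The boxes appear in the order (a, df_a), (a, df_b), (b, df_a) and so on, so the two boxes of each category sit next to each other.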
CodePudding user response:
Does this help? I tried to make it somewhat dynamic and flexible:
import matplotlib.pyplot as plt
import seaborn as sns

# Repeat each category twice: one subplot for df_a, one for df_b
ids = [val for val in df_a["id"].unique() for _ in (0, 1)]
fig, ax = plt.subplots(len(ids) // 2, 2, figsize=(10, 10))
plt.subplots_adjust(hspace=0.5, wspace=0.3)
plt.suptitle("df_a vs. df_b")
ax = ax.ravel()
for i, cat in enumerate(ids):
    # Even indices (left column) show df_a, odd indices (right column) show df_b
    df = df_a if i % 2 == 0 else df_b
    sns.boxplot(x=df.loc[df["id"] == cat, "val"], ax=ax[i])
    ax[i].set_title(cat)
sns.despine()
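If one axes with two boxes per category is closer to what you want than a grid of subplots, seaborn's `hue` parameter can do the grouping. This is only a sketch: the `source` column, the `combined` name, and the seeded random generator are assumptions introduced for the example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Rebuild data shaped like the question's (seeded only for reproducibility).
rng = np.random.default_rng(0)
n = 100
ids = np.repeat(["a", "b", "c"], n)
df_a = pd.DataFrame({"id": ids, "val": rng.normal(0, 1, 3 * n)})
df_b = pd.DataFrame({"id": ids, "val": rng.normal(1, 1, 3 * n)})

# "source" is a hypothetical helper column recording which frame a row came from.
combined = pd.concat(
    [df_a.assign(source="df_a"), df_b.assign(source="df_b")],
    ignore_index=True,
)

fig, ax = plt.subplots(figsize=(8, 4))
# x places one group per category; hue splits each group into df_a and df_b.
sns.boxplot(data=combined, x="id", y="val", hue="source", ax=ax)
ax.set_title("df_a vs. df_b")
sns.despine()
```

Call `plt.show()` to display it. Since the category and the source are ordinary columns here, adding a third dataframe would only require one more entry in the `pd.concat` list.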