Labeling colors in seaborn without hue

Time:12-21

I am trying to make a kernel density plot in seaborn from four datasets. I can already draw the kernel density estimates for all four in one graph, which is what I want, but I don't know how to add a legend that shows which color belongs to which dataset (red: dataset 1, green: dataset 2, etc.). Usually this is done by passing the hue parameter, which labels colors by the values of a column within a single dataset. Since I am plotting separate datasets on the same axes, how can I label the colors?

Here are the code and the plot:


sns.kdeplot(data=group1_df, x="PEAK SPEED")
sns.kdeplot(data=group2_df, x="PEAK SPEED")
sns.kdeplot(data=group3_df, x="PEAK SPEED")
sns.kdeplot(data=group4_df, x="PEAK SPEED")

Plot

CodePudding user response:

The easiest way is to give each kdeplot a label, which matplotlib will then use in the legend.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

np.random.seed(20221221)
group1_df = pd.DataFrame({"PEAK SPEED": np.random.randn(200).cumsum()})
group2_df = pd.DataFrame({"PEAK SPEED": np.random.randn(500).cumsum()})
group3_df = pd.DataFrame({"PEAK SPEED": np.random.randn(700).cumsum()})
group4_df = pd.DataFrame({"PEAK SPEED": np.random.randn(800).cumsum()})

sns.kdeplot(data=group1_df, x="PEAK SPEED", label='group 1')
sns.kdeplot(data=group2_df, x="PEAK SPEED", label='group 2')
sns.kdeplot(data=group3_df, x="PEAK SPEED", label='group 3')
sns.kdeplot(data=group4_df, x="PEAK SPEED", label='group 4')

plt.legend()
plt.show()

kdeplots with labels

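Alternatively, you can concatenate the four original dataframes into one long dataframe with an extra column that identifies the group. That enables the options that go together with hue, such as common_norm (default: True, which scales each curve by its group's share of the observations so that all curves together integrate to 1; with False, each curve integrates to 1 on its own), stacking the curves (multiple='stack'), or a specific color palette.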

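# The dict keys become the outer level of the index of the concatenated dataframe
df = pd.concat({'group 1': group1_df, 'group 2': group2_df, 'group 3': group3_df, 'group 4': group4_df})
# Move that outer level into a regular column and name it 'Group'
df = df.reset_index(level=0).reset_index(drop=True).rename(columns={'level_0': 'Group'})

# common_norm=False: each group's curve integrates to 1 on its own
sns.kdeplot(data=df, x="PEAK SPEED", hue='Group', common_norm=False, legend=True)
plt.show()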

sns.kdeplot with long dataframe
