Coloring sns barplot based on condition from another dataframe

Time:10-29

I am using the following code to generate a barplot from a dataframe df1(x, y). (For simplicity, I have put sample values directly into the chart code.)

sns.barplot(x=['A','B','C','D','E','F','G'],y=[10,40,20,5,60,30,80],palette='Blues_r')

This generates a beautiful chart with shades of blue color in descending order for bars A to G.

However, I need the colors to follow the order determined by another dataframe, df2, which holds a value for each of A to G. I do not want to change the order of A to G in this chart, so sorting df1 by the values of df2 will not work for me.

So, say df2 is like this:

A    90
B    70
C    40
D    30
E    30
F    20
G    80

Notice that df2 can contain equal values (D and E), in which case I do not care whether D and E get the same color or adjacent colors from the palette. But no other bar should have a color between those of D and E. That is, the chart must show the bars in the fixed order A to G, while the colors follow the order of the df2 values.

How do we do this?

CodePudding user response:

You can try vanilla plt.bar:

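import matplotlib.pyplot as plt
import pandas as pd

x = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
y = [10, 40, 20, 5, 60, 30, 80]

# assuming the columns of df2 are 'x' and 'color' (sample values from the question)
df2 = pd.DataFrame({'x': x, 'color': [90, 70, 40, 30, 30, 20, 80]})
colors = df2.set_index('x').loc[x, 'color']

# a colormap expects floats in [0, 1], so rescale the raw values first
norm = plt.Normalize(colors.min(), colors.max())
cmap = plt.get_cmap('Blues_r')
plt.bar(x, y, color=cmap(norm(colors.to_numpy())))
plt.show()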


CodePudding user response:

You can use hue= with the values of the second dataframe. You'll also need dodge=False to tell Seaborn that you want a full bar per x-position.

import seaborn as sns
import pandas as pd
import numpy as np

df1 = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
                    'y': [10, 40, 20, 5, 60, 30, 80]})
df2 = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
                    'y': [90, 70, 40, 30, 30, 20, 80]})
sns.barplot(data=df1, x='x', y='y', palette='Blues_r', hue=df2['y'], dodge=False, legend=False)

[Figure: sns.barplot with coloring from a different dataframe]

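Note that this uses the values in df2['y'] to make the relative coloring. If you just want to use the order, you can use the rank of each value instead, for example df2['y'].rank(method='first'). (np.argsort(df2['y']) on its own returns the indices that would sort the array, not the rank of each element, so it would assign the colors to the wrong bars.)

ax = sns.barplot(data=df1, x='x', y='y', palette='Blues_r', hue=df2['y'].rank(method='first'), dodge=False)
ax.legend_.remove()  # remove the legend, which consists of the ranks 1, 2, 3, ...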
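
To make "using just the order" concrete, here is a minimal, dependency-free sketch of what a per-bar rank looks like for the sample df2 values from the question. The helper name `ranks` and the variable names are hypothetical and only for illustration. Note that a rank (the position of each item in sorted order) is not the same thing as the list of indices that would sort the values.

```python
def ranks(values):
    """Return the 0-based rank of each value.

    Equal values get adjacent ranks, in their input order.
    """
    # Indices that would sort the values; sorted() is stable,
    # so tied values keep their original relative order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    # Invert the sorting order: the item at index `idx` has rank `rank`.
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


labels = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
df2_values = [90, 70, 40, 30, 30, 20, 80]  # sample values from the question

rank_by_label = dict(zip(labels, ranks(df2_values)))
print(rank_by_label)
```

Here D and E share the value 30 and receive neighbouring ranks, so no other bar can land between them in the palette. Ranks like these could be passed as hue, or divided by the largest rank to get positions in [0, 1] for a matplotlib colormap.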