I am using the following code to generate a bar plot from a dataframe df1(x, y). (For simplicity, I have put sample values directly in the chart code.)
sns.barplot(x=['A','B','C','D','E','F','G'],y=[10,40,20,5,60,30,80],palette='Blues_r')
This generates a beautiful chart with shades of blue in descending order for bars A to G.
However, I need the color order to be determined by another dataframe, df2, which holds a value for each of A to G. I do not want to change the order of A to G in this chart, so sorting df1 by the values of df2 will not work for me.
So, say df2 is like this:
A 90
B 70
C 40
D 30
E 30
F 20
G 80
Notice that df2 can contain equal values (D and E). In that case I do not care whether D and E get the same color or adjacent colors from the palette, but no other bar should have a color in between those of D and E. That is, the bars must run from A to G (fixed order), while the colors follow the order of the df2 values.
How do we do this?
CodePudding user response:
You can try vanilla plt.bar:
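import matplotlib.pyplot as plt
x = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
y = [10, 40, 20, 5, 60, 30, 80]
# assuming the columns of df2 are 'x' and 'color'
colors = df2.set_index('x').loc[x, 'color']
# scale the values to [0, 1] so they span the whole colormap
norm = plt.Normalize(colors.min(), colors.max())
cmap = plt.get_cmap('Blues_r')
plt.bar(x, y, color=cmap(norm(colors.to_numpy())))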
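If only the order of the df2 values should drive the shade, not their magnitude, one option is to convert the values to ranks before looking them up in the colormap. The following is a minimal sketch under that assumption; the column name 'color' follows the assumption made above, and names such as ranks, shades and bar_colors are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

x = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
y = [10, 40, 20, 5, 60, 30, 80]
# assumed layout of df2: one row per bar, value in a column named 'color'
df2 = pd.DataFrame({'x': x, 'color': [90, 70, 40, 30, 30, 20, 80]})

# look up the df2 value for each bar, in the fixed A..G order
values = df2.set_index('x').loc[x, 'color']

# dense ranks: 0 for the smallest value, ties (D and E) share one rank
ranks = values.rank(method='dense') - 1

# spread the ranks evenly over [0, 1] and map them to colors
shades = ranks / ranks.max()
cmap = plt.get_cmap('Blues_r')
bar_colors = [cmap(s) for s in shades]

fig, ax = plt.subplots()
ax.bar(x, y, color=bar_colors)
```

With 'Blues_r', the smallest df2 value gets the darkest shade; swapping in 'Blues' reverses that. Because the ranks are dense, D and E end up with the same color and no other bar can fall between them.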
CodePudding user response:
You can use hue= with the values of the second dataframe. You'll also need dodge=False to tell Seaborn that you want a full bar per x-position.
import seaborn as sns
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
                    'y': [10, 40, 20, 5, 60, 30, 80]})
df2 = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
                    'y': [90, 70, 40, 30, 30, 20, 80]})
sns.barplot(data=df1, x='x', y='y', palette='Blues_r', hue=df2['y'], dodge=False, legend=False)
Note that this uses the values in df2['y'] to make the relative coloring. If you just want to use the order, you can use df2['y'].rank(method='dense') to get the rank of each value. (np.argsort(df2['y']) would return the indices that sort the array, not the rank of each element.)
ax = sns.barplot(data=df1, x='x', y='y', palette='Blues_r', hue=df2['y'].rank(method='dense'), dodge=False)
ax.legend_.remove() # remove the legend, which consists of the ranks 1.0, 2.0, ...
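Another way to get the same effect, sketched here as an assumption rather than as part of the answer above, is to build an explicit mapping from each x label to a color and pass it as palette. This makes the color of every bar inspectable before plotting. The names ranks, shades and palette are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import pandas as pd
import seaborn as sns

df1 = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
                    'y': [10, 40, 20, 5, 60, 30, 80]})
df2 = pd.DataFrame({'x': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
                    'y': [90, 70, 40, 30, 30, 20, 80]})

# dense ranks of the df2 values: 0 = smallest, ties (D and E) share a rank
ranks = df2['y'].rank(method='dense').astype(int) - 1

# one shade per distinct rank, taken from the same palette as in the question
shades = sns.color_palette('Blues_r', n_colors=int(ranks.max()) + 1)

# explicit label -> color mapping
palette = {label: shades[int(r)] for label, r in zip(df2['x'], ranks)}

# hue='x' lets the dict assign one color per bar; bar order stays A..G
ax = sns.barplot(data=df1, x='x', y='y', hue='x', palette=palette, dodge=False)
if ax.legend_ is not None:
    ax.legend_.remove()  # the legend would only repeat the x labels
```

The dict keeps the bar order independent of the color order: df1 is never sorted, and changing df2 only changes which shade each label maps to.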