I have this code:
import numpy as np, matplotlib.pyplot as plt, seaborn as sns
fig = plt.figure(1)
sns.set(style="whitegrid", color_codes=True)
data=departments.value_counts()[:10]
pal = sns.color_palette("Blues_d", len(data))
rank = data.argsort().argsort()
sns.barplot(y=data.index, x=data, palette=np.array(pal[::1])[rank])
plt.title("Top departments", y=1.02)
fig.savefig('Top_departments', bbox_inches='tight', dpi=300)
My data looks like this:
Chemistry 19
Computer Science 13
Physics 10
Epidemiology and Biostatistics 8
Psychology 8
Mechanical and Industrial Engineering 8
Sociology 7
Academic 7
Bioengineering 7
Biochemistry and Molecular Genetics 6
Plot is attached. The issue is that bars with equal values, for example Sociology, Academic and Bioengineering, get different shades of blue, while they should be the same shade because they have the same numeric value.
Answer:
The question's code assigns a list of sequential colors to the bars in order of rank. Because data.argsort().argsort() gives every bar a distinct rank, tied values always end up with different colors, and the order within a tie is not well-defined. That approach is suited to coloring bars from high to low when they aren't already sorted and when all values are different.
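To see why ties end up with different shades, here is a minimal sketch of the double argsort in plain Python. The argsort helper is a hypothetical stand-in for the pandas/NumPy method, and it uses a stable sort, so the order it gives within a tie is only one of the possible outcomes:

```python
# Values from the question, in the order of the plotted bars.
values = [19, 13, 10, 8, 8, 8, 7, 7, 7, 6]

def argsort(seq):
    # Hypothetical helper: indices that would sort seq in ascending order.
    return sorted(range(len(seq)), key=seq.__getitem__)

# Applying argsort twice turns each value into its rank (0 = smallest).
rank = argsort(argsort(values))

for value, r in zip(values, rank):
    print(value, r)
```

Every rank is used exactly once, so the three 7s (and the three 8s) index three different palette entries, even though their values are equal.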
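Here is an approach that colors by value: each value is normalized to the range 0 to 1 and looked up in a continuous colormap, so equal values get the same color: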
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
fig = plt.figure(1)
sns.set(style="whitegrid", color_codes=True)
data = pd.Series([19, 13, 10, 8, 8, 8, 7, 7, 7, 6],
                 index=['Chemistry', 'Computer Science', 'Physics', 'Epidemiology and Biostatistics',
                        'Psychology', 'Mechanical and Industrial Engineering', 'Sociology',
                        'Academic', 'Bioengineering', 'Biochemistry and Molecular Genetics'])
cmap = sns.color_palette("Blues_d", as_cmap=True)
norm = plt.Normalize(data.min(), data.max())
sns.barplot(y=data.index, x=data, palette=cmap(norm(data.values)))
plt.title("Top departments", y=1.02)
plt.tight_layout()
plt.show()
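As a rough check of the idea, the linear rescaling that plt.Normalize(data.min(), data.max()) applies can be reproduced in plain Python. The normalize function below is a hypothetical stand-in written for illustration, not matplotlib's implementation:

```python
# Values from the question, in the order of the plotted bars.
values = [19, 13, 10, 8, 8, 8, 7, 7, 7, 6]

lo, hi = min(values), max(values)

def normalize(v):
    # Hypothetical stand-in: linear rescaling of v from [lo, hi] to [0, 1].
    return (v - lo) / (hi - lo)

positions = [normalize(v) for v in values]

for value, pos in zip(values, positions):
    print(f"{value:>2} -> {pos:.3f}")
```

Equal values land on the same position in [0, 1], and the colormap returns one color per position, so Sociology, Academic and Bioengineering share a shade.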
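To just color the bars from dark to light by position, you could use the reversed palette directly, without involving argsort. This relies on value_counts() already returning the data sorted from high to low; tied values still get different shades this way: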
pal = sns.color_palette("Blues_r_d", len(data))
sns.barplot(y=data.index, x=data, palette=pal)
plt.title("Top departments", y=1.02)