Color an entire DataFrame based on distinct values in a particular column

Time:03-10

This question is a continuation of my previously asked question.

CodePudding user response:

The error says it expects shape (5, 3), because the DataFrame has 5 rows and 3 columns, but it only gets 5 elements. The Styler expects a style for every column in each row, and that mismatch causes the error.

If you follow the approach in the other answer on Stack Overflow, it should work. All you need is pd.Series(colors[row['Country']], row.index) in your test_check function. You can try the following:

import pandas as pd
import matplotlib.colors

d1 = pd.DataFrame({"Country": ['xx', 'xx', 'xy', 'xz', 'xz'],
                   "year": ['y1', 'y2', 'y1', 'y1', 'y2'],
                   "population": [100, 200, 120, 140, 190]})

# Map each distinct country to a CSS background color from matplotlib's named colors
colors = dict(zip(d1['Country'].unique(),
                  (f'background-color: {c}' for c in matplotlib.colors.cnames.values())))


def test_check(row):
    # Return one style string per column so the result matches the row's shape
    return pd.Series(colors[row['Country']], row.index)


d1.style.apply(test_check, axis=1)

Adding more explanation:

If you try colors[d1.iloc[0]['Country']], i.e. colors[row['Country']] where row is the first row of the DataFrame, it returns a value from the dictionary such as 'background-color: #F0F8FF'. Calling test_check(d1.iloc[0]) then returns that same color for all three columns of the row, as shown here for the first row:

Country       background-color: #F0F8FF
year          background-color: #F0F8FF
population    background-color: #F0F8FF

That way each row gets the same color value for all three of its columns, and the Styler applies it. You can see the values returned for the whole DataFrame with d1.apply(test_check, axis=1).
