Color distinct values in pandas columns from the other column

I have 4 columns of string values and I want to visually compare the distinct values in 3 of the columns to the values in the first column.

So, if a value in column 2, 3, or 4 is not in column 1, I want to color that cell in the corresponding column.

I don't know how this can be implemented, as I have never worked with pd.DataFrame coloring.

Example of dataframe:

import pandas as pd

d = {'column 1': ['red', 'green', 'blue', 'white', ''],
     'column 2': ['red', 'blue', 'white', 'green', 'yellow'],
     'column 3': ['blue', 'yellow', 'brown', '', ''],
     'column 4': ['red', 'white', 'blue', 'green', '']}

df = pd.DataFrame(data=d)

Here I need to color the following:

in column 2 -- yellow cell

in column 3 -- yellow and brown cells

in column 4 -- nothing

Don't mind the empty strings here; in my dataframe they are NaNs.

CodePudding user response：

You could try as follows.

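import pandas as pd
import numpy as np

def f(col):
    # Empty style for values found in column 1; highlight everything else
    return np.where(col.isin(df['column 1']), '', 'background-color: IndianRed')

# Apply the function column-wise to the three columns being compared
df.style.apply(f, subset=['column 2', 'column 3', 'column 4'])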

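Result: with the example dataframe, the yellow cell in column 2 and the yellow and brown cells in column 3 are highlighted in IndianRed; nothing is highlighted in column 4.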
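The cells picked out by the `isin` test can also be checked without rendering anything, by building the same test as a boolean mask. This is only a minimal sketch on the example dataframe from the question; the names `other_cols` and `not_in_first` are made up for illustration.

```python
import pandas as pd

d = {'column 1': ['red', 'green', 'blue', 'white', ''],
     'column 2': ['red', 'blue', 'white', 'green', 'yellow'],
     'column 3': ['blue', 'yellow', 'brown', '', ''],
     'column 4': ['red', 'white', 'blue', 'green', '']}
df = pd.DataFrame(data=d)

# Hypothetical helper names, not part of the original answer
other_cols = ['column 2', 'column 3', 'column 4']

# True where a cell's value does not occur anywhere in column 1
not_in_first = ~df[other_cols].isin(df['column 1'].tolist())

print(not_in_first)
```

Every `True` in the printed frame corresponds to a cell that the styling function would color.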

If you want to highlight the missing values in their 'own' color, you could write something like this:

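def f(col):
    # Blank out values that occur in column 1, then use each remaining
    # value as its own CSS color name
    return [f'background-color: {v}' if not pd.isnull(v) else ''
            for v in col.mask(col.isin(df['column 1']))]

df.style.apply(f, subset=['column 2', 'column 3', 'column 4'])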

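Result: with the example dataframe, the same cells are highlighted, each in the color its value names (yellow and brown).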
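Since the question says the blanks are NaN in the real dataframe, it may be safer to skip NaN cells explicitly rather than rely on how `isin` handles them. The following is a sketch under that assumption; the function name `highlight_missing` and its `reference` and `css` parameters are hypothetical, and the data is the question's example with the empty strings swapped for NaN.

```python
import numpy as np
import pandas as pd

# Example data from the question, with '' replaced by NaN (an assumption
# based on the asker's note about the real dataframe)
df = pd.DataFrame({
    'column 1': ['red', 'green', 'blue', 'white', np.nan],
    'column 2': ['red', 'blue', 'white', 'green', 'yellow'],
    'column 3': ['blue', 'yellow', 'brown', np.nan, np.nan],
    'column 4': ['red', 'white', 'blue', 'green', np.nan],
})

def highlight_missing(col, reference, css='background-color: IndianRed'):
    # Flag only non-null values that are absent from the reference column
    flag = col.notna() & ~col.isin(reference.dropna())
    return np.where(flag, css, '')

# Inspect the styles that would be produced for one column
styles_col3 = highlight_missing(df['column 3'], df['column 1']).tolist()
print(styles_col3)
```

Extra keyword arguments given to `Styler.apply` are passed through to the function, so this should plug in as `df.style.apply(highlight_missing, reference=df['column 1'], subset=['column 2', 'column 3', 'column 4'])`.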

CodePudding user response：

I came up with something like this:

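import matplotlib.colors

# Pair each unique value in column 1 with a named matplotlib color (as a CSS rule)
colors = dict(zip(df['column 1'].unique(),
                  (f'background-color: {c}' for c in matplotlib.colors.cnames.values())))

# Values with no entry in the dict get no style.
# Note: in pandas 2.1+, Styler.applymap is deprecated in favor of Styler.map.
df.style.applymap(colors.get, subset=['column 1', 'column 2', 'column 3', 'column 4'])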