I have 4 columns of string values and I want to visually compare the distinct values in 3 of the columns to the values in the first column.
So, if a value in column 2, 3 or 4 is not in column 1, I want to color that cell in the corresponding column.
I don't know how this can be implemented, as I have never worked with pd.DataFrame coloring.
Example of dataframe:
d = {'column 1': ['red', 'green', 'blue', 'white', ''],
     'column 2': ['red', 'blue', 'white', 'green', 'yellow'],
     'column 3': ['blue', 'yellow', 'brown', '', ''],
     'column 4': ['red', 'white', 'blue', 'green', '']}
df = pd.DataFrame(data=d)
Here I need to color the following:
in column 2 -- yellow cell
in column 3 -- yellow and brown cells
in column 4 -- nothing
Don't mind the empty strings here; in my dataframe they are NaNs.
CodePudding user response:
You could try as follows.
import pandas as pd
import numpy as np
def f(col):
    # Leave cells whose value occurs in column 1 unstyled; color the rest.
    return np.where(col.isin(df['column 1']), '', 'background-color: IndianRed')
df.style.apply(f, subset=['column 2', 'column 3', 'column 4'])
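Result: for the example dataframe, the yellow cell in column 2 and the yellow and brown cells in column 3 are highlighted; column 4 stays unstyled.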
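Since the real data has NaNs instead of empty strings, here is a minimal sketch that makes the NaN handling explicit and passes the reference column as an argument instead of reading the global df. The names highlight_missing and style_missing are made up for this illustration, and never coloring NaN cells is an assumption about what is wanted.

```python
import numpy as np
import pandas as pd

# Same example as in the question, with NaN in place of the empty strings.
df = pd.DataFrame({
    'column 1': ['red', 'green', 'blue', 'white', np.nan],
    'column 2': ['red', 'blue', 'white', 'green', 'yellow'],
    'column 3': ['blue', 'yellow', 'brown', np.nan, np.nan],
    'column 4': ['red', 'white', 'blue', 'green', np.nan],
})

def highlight_missing(col, reference, css='background-color: IndianRed'):
    """Return one CSS string per cell: `css` where the value is absent from `reference`."""
    # Assumption: NaN cells mean "nothing to compare" and are never colored.
    missing = col.notna() & ~col.isin(reference.dropna())
    return np.where(missing, css, '')

def style_missing(frame, ref_col='column 1'):
    """Build a Styler that colors cells in the other columns that are not in `ref_col`."""
    others = [c for c in frame.columns if c != ref_col]
    # Extra keyword arguments given to Styler.apply are forwarded to the function.
    return frame.style.apply(highlight_missing, reference=frame[ref_col], subset=others)
```

In a notebook, putting style_missing(df) as the last expression of a cell should display the colored table. Calling highlight_missing on a single column is a quick way to check which cells would be selected without rendering anything.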
If you want to highlight the missing values in their 'own' color, you could write something like this:
def f(col):
    return [f'background-color: {v}' if not pd.isnull(v) else ''
            for v in col.mask(col.isin(df['column 1']))]
df.style.apply(f, subset=['column 2', 'column 3', 'column 4'])
Result: the same cells are highlighted, each filled with the color it names (yellow and brown).
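Coloring a cell in its "own" color only has a visible effect when the value is a valid CSS color name. A minimal sketch with a fallback color for other strings could look like this; own_color_or_fallback and CSS_NAMES are hypothetical names, and the set below is just a small illustrative subset (the keys of matplotlib.colors.CSS4_COLORS would be one possible fuller source).

```python
import pandas as pd

# Illustrative subset of CSS color names; extend it as needed.
CSS_NAMES = {'red', 'green', 'blue', 'white', 'yellow', 'brown'}

def own_color_or_fallback(col, reference, css_names=CSS_NAMES, fallback='IndianRed'):
    """Return one CSS string per cell for values of `col` that are absent from `reference`."""
    styles = []
    # mask() replaces values that occur in `reference` with NaN, so only missing ones remain.
    for v in col.mask(col.isin(reference)):
        if pd.isna(v):
            styles.append('')
        elif str(v).lower() in css_names:
            styles.append(f'background-color: {v}')
        else:
            # Not a known color name: use the fallback so the cell is still marked.
            styles.append(f'background-color: {fallback}')
    return styles
```

It can be passed to the styler the same way as f, for example df.style.apply(own_color_or_fallback, reference=df['column 1'], subset=['column 2', 'column 3', 'column 4']).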
CodePudding user response:
I came up with something like this:
import matplotlib.colors
# Assign each distinct value of column 1 its own background color.
colors = dict(zip(df['column 1'].unique(),
                  (f'background-color: {c}' for c in matplotlib.colors.cnames.values())))
df.style.applymap(colors.get, subset=['column 1', 'column 2', 'column 3', 'column 4'])
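With the mapping above, values that do not occur in column 1 are the ones left unstyled. If the missing values should be the colored ones instead, an element-wise function could be sketched as below. make_cell_style and style_elementwise are made-up names for this illustration, and skipping NaN cells is an assumption. Styler.applymap was deprecated in pandas 2.1 in favor of Styler.map, so the sketch picks whichever method is available.

```python
import pandas as pd

def make_cell_style(reference, css='background-color: IndianRed'):
    """Return an element-wise function suitable for Styler.map / Styler.applymap."""
    known = set(reference.dropna())

    def cell_style(value):
        # Assumption: NaN cells and values present in the reference stay unstyled.
        if pd.isna(value) or value in known:
            return ''
        return css

    return cell_style

def style_elementwise(frame, ref_col='column 1'):
    """Color every cell outside `ref_col` whose value does not occur in `ref_col`."""
    styler = frame.style
    # Prefer Styler.map (pandas >= 2.1); fall back to applymap on older versions.
    apply_elementwise = getattr(styler, 'map', None) or styler.applymap
    others = [c for c in frame.columns if c != ref_col]
    return apply_elementwise(make_cell_style(frame[ref_col]), subset=others)
```

Usage would be style_elementwise(df) as the last expression of a notebook cell.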