Edit: updated the random DataFrame generator.
I have two DataFrames, one used as a mask for the other.
import numpy as np
import pandas as pd
rndm = pd.DataFrame(np.random.randint(0, 15, size=(100, 4)), columns=list('ABCD'))
rndm_mask = pd.DataFrame(np.random.randint(0, 2, size=(100, 4)), columns=list('ABCD'))
I want to use two conditions to decide which values in rndm get highlighted:
- Is the value the mode of the column?
- rndm_mask == 1
What works so far:
def colorBoolean(val):
    return f'background-color: {"red" if val else ""}'
rndm.style.apply(lambda _: rndm_mask.applymap(colorBoolean), axis=None)
# helper function to find Mode
def highlightMode(s):
    # Get mode of columns
    mode_ = s.mode().values
    # Apply style if the current value is in mode_ array (len==1)
    return ['background-color: yellow' if v in mode_ else '' for v in s]
Issue:
I'm unsure how to chain both functions so that values in rndm are highlighted only if they match both criteria (i.e. the value must be the most frequent value in its column and also be True in rndm_mask).
I appreciate any advice! Thanks
CodePudding user response:
Try this: since your df_bool DataFrame is a mask (identically indexed), you can refer to the df_bool object inside the style function, where x.name is the name of the column passed in via df.style.apply:
df = pd.DataFrame({'A': [5.5, 3, 0, 3, 1],
                   'B': [2, 1, 0.2, 4, 5],
                   'C': [3, 1, 3.5, 6, 0]})
df_bool = pd.DataFrame({'A': [0, 1, 0, 0, 1],
                        'B': [0, 0, 1, 0, 0],
                        'C': [1, 1, 1, 0, 0]})
# I want to use 2 conditions to change the values in df:
# Is the value the mode of the column?
# df_bool == 1
# What works so far:
def colorBoolean(x):
    return ['background-color: red' if v else '' for v in df_bool[x.name]]
# helper function to find Mode
def highlightMode(s):
    # Get mode of columns
    mode_ = s.mode().values
    # Apply style if the current value is in mode_ array (len==1)
    return ['background-color: yellow' if v in mode_ else '' for v in s]
df.style.apply(colorBoolean).apply(highlightMode)
Or the other way:
df.style.apply(highlightMode).apply(colorBoolean)
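To see which cells each function marks without rendering anything, the two style functions can be called on each column by hand. The sketch below is only an illustration: the snake_case function names and the `preview` helper are made up here, and the data is the small example from this answer. Cells where both functions return a style are the only ones where the chaining order can matter.

```python
import pandas as pd

df = pd.DataFrame({'A': [5.5, 3, 0, 3, 1],
                   'B': [2, 1, 0.2, 4, 5],
                   'C': [3, 1, 3.5, 6, 0]})
df_bool = pd.DataFrame({'A': [0, 1, 0, 0, 1],
                        'B': [0, 0, 1, 0, 0],
                        'C': [1, 1, 1, 0, 0]})

def color_boolean(x):
    # x is one column of df; x.name selects the matching mask column
    return ['background-color: red' if v else '' for v in df_bool[x.name]]

def highlight_mode(s):
    # s.mode() can hold several values when there is a tie
    mode_ = s.mode().values
    return ['background-color: yellow' if v in mode_ else '' for v in s]

def preview(func, data):
    # Hypothetical helper: apply a column-wise style function to every
    # column and collect the CSS strings in a DataFrame shaped like data
    return pd.DataFrame({col: func(data[col]) for col in data}, index=data.index)

red = preview(color_boolean, df)
yellow = preview(highlight_mode, df)

# True where both functions return a non-empty style for the same cell
overlap = (red != '') & (yellow != '')
print(overlap)
```

Note that columns B and C have no repeated values, so every value in them counts as a mode and the overlap there is simply the mask.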
Update
Highlight where both are true:
def highlightMode(s):
    # Get mode of columns (can hold several values if there is a tie)
    mode_ = s.mode().values
    # Apply style only if the value is a mode AND the mask is truthy for that cell
    return ['background-color: yellow' if (v in mode_) and b else '' for v, b in zip(s, df_bool[s.name])]
df.style.apply(highlightMode)
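An alternative sketch, not part of the answer above, is to build the whole CSS frame at once and pass it through `Styler.apply` with `axis=None`, so the mask is an explicit argument rather than a global. The function name `both_conditions_css` and its `css` parameter are made up for illustration; it assumes the mask has the same index and columns as the data.

```python
import numpy as np
import pandas as pd

def both_conditions_css(data, mask, css='background-color: yellow'):
    # Per column: True where the value is one of that column's modes
    is_mode = data.apply(lambda s: s.isin(s.mode()))
    # Combine with the mask (assumed to share index and columns with data)
    hit = is_mode & mask.astype(bool)
    # Return a DataFrame of CSS strings with the same shape as data
    return pd.DataFrame(np.where(hit, css, ''),
                        index=data.index, columns=data.columns)

df = pd.DataFrame({'A': [5.5, 3, 0, 3, 1],
                   'B': [2, 1, 0.2, 4, 5],
                   'C': [3, 1, 3.5, 6, 0]})
df_bool = pd.DataFrame({'A': [0, 1, 0, 0, 1],
                        'B': [0, 0, 1, 0, 0],
                        'C': [1, 1, 1, 0, 0]})

css_frame = both_conditions_css(df, df_bool)
print(css_frame)
```

In a notebook this would be used as `df.style.apply(both_conditions_css, mask=df_bool, axis=None)`, since `Styler.apply` forwards extra keyword arguments to the function.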