Home > Net >  Pandas Styling with Conditional Rules
Pandas Styling with Conditional Rules

Time:11-05

I was trying to style a Pandas Dataframe using 2 different columns. I was successful as long as the conditional was about the column itself, however when it depends on another one, I was unable to get the desired results.

I want to color the cell in "Date II" if "Date I" is in the past.

def date_pii(row):
        ret = ["" for _ in row.index]
        print(row['Date PI'])
        if row['Date PI'] < datetime.now():
            ret[row.index.get_loc("Date PII")] = "background-color: red"
            return ret

styler = df3.style \
        .applymap(lambda x: 'background-color: %s' % 'red' if x <= datetime.now() else '', subset=['Date PI']) \
        .applymap(lambda x: 'background-color: %s' % 'yellow' if x < datetime.now()   timedelta(days=30) else '',
                  subset=['Date PII']) \
        .applymap(lambda x: 'background-color: %s' % 'orange' if x <= datetime.now() else '', subset=['Date PII']) \
        .applymap(lambda x: 'background-color: %s' % 'grey' if pd.isnull(x) else '', subset=['Date PI'])\
        .applymap(lambda x: 'background-color: %s' % 'grey' if pd.isnull(x) else '', subset=['Date PII'])\
        .apply(date_pii, axis=1) ----> THIS IS THE ISSUE

styler.to_excel(writer, sheet_name='Report Paris', index=False)

At runtime I get the following error:

ValueError: Function <function generate_report_all.<locals>.date_pii at 0x7fd3964d9160> returned the wrong shape.
Result has shape: (532,)
Expected shape:   (532, 10)

The dataframe looks like this: df3

The first orange cell in "Date PII" is correct, however, the remaining ones (where PI is red) I would like them to become red as well.

Thanks for the help!

CodePudding user response:

The general approach to this type of problem is to pass the specified columns as a subset to styled DataFrame


Setup DataFrame randomly generated always relative to the current date (styles will be consistent while dates will not be):

import numpy as np
import pandas as pd
from numpy.random import Generator, MT19937

norm_today = pd.Timestamp.now().normalize()
rng = Generator(MT19937(1023))


def md(lower_bound, upper_bound, rng_=rng):
    return pd.Timedelta(days=rng_.integers(lower_bound, upper_bound))


df3 = pd.DataFrame({
    'Desc': [
        'PII within 30 days',  # PII yellow
        'PII in past and PI in future',  # PII orange
        'PI past',  # Both red
        'PI empty',  # grey
        'PII empty',  # grey
        'PII in future but not within 30 days'  # No Styles
    ],
    'Date PII': [norm_today   md(1, 10), norm_today - md(1, 10),
                 norm_today, norm_today, np.nan,
                 norm_today   md(40, 50)],
    'Date PI': [norm_today, norm_today   md(1, 10),
                norm_today - md(1, 10), np.nan, norm_today,
                norm_today]
})
Desc Date PII Date PI
PII within 30 days 2021-11-06 00:00:00 2021-11-03 00:00:00
PII in past and PI in future 2021-10-31 00:00:00 2021-11-11 00:00:00
PI past 2021-11-03 00:00:00 2021-11-01 00:00:00
PI empty 2021-11-03 00:00:00 NaT
PII empty NaT 2021-11-03 00:00:00
PII in future but not within 30 days 2021-12-19 00:00:00 2021-11-03 00:00:00
  • Related