Home > Software engineering >  Pandas conditional formatting on multiple columns with different criteria
Pandas conditional formatting on multiple columns with different criteria

Time:01-08

Happy new year,

I would like conditionally format a df with multiple criterias according to its columns, which looks like below:

business_entity 2021-H1 2022-H1 2022-H2 Movement
2 ABC 0 0 100 100
4 DEF 99 99 100 1
8 GHI 97 97 98 1
20 JKl 98 98 98 0
3 MNO 98 97 98 1
23 PQR 0 0 98 98
19 STU 98 97 98 1
22 VWX 96 98 98 0
17 DFD 97 99 98 -1

What I'm trying to do is to format the df with the following criteria:

for columns from 2021-H1 to 2022-H2: if values >= 89 (green), if values >=79.5 and <89 (orange), and if value <79.5 (red)

for "Movement" column: if value >=.05 (green), value >-.05 and <.05 (orange), and value <=-.05 (red)

I've done some research here and come up with the below code:

division_summary_table.style.apply(lambda x: ['background:green'
                                          if (colname=='2021-H1' and value >=89)
                                          else 'background:orange'
                                          if (colname=='2021-H1' and value >=79.5)
                                          else 'background:red'
                                          if (colname=='2021-H1' and value <79.5)
                                          else 'background:green'
                                          if (colname=='2022-H1' and value >=89)
                                          else 'background:orange'
                                          if (colname=='2022-H1' and value >=79.5)
                                          else 'background:red'
                                          if (colname=='2022-H1' and value <79.5)
                                          else 'background:green'
                                          if (colname=='2022-H2' and value >=89)
                                          else 'background:orange'
                                          if (colname=='2022-H2' and value >=79.5)
                                          else 'background:red'
                                          if (colname=='2022-H2' and value <79.5)
                                          else 'background:green'
                                          if (colname=='Movement' and value >=.5)
                                          else 'background:orange'
                                          if (colname=='Movement' and value >=0)
                                          else 'background:red'
                                          if (colname=='Movement' and value <-.5)
                                          else ' '
                                          for colname, value in x.items()],axis=1).format(precision=0)

It works fine for now but really "verbose". Since I need to produce multiple table like this and to apply the same formatting, I tried to to write a simple function then reuse it later:

def styler(df):
    for colname, value in df.items():

        if (colname=='2021-H1' and value >=89):
            return 'background:green'
        elif (colname=='2021-H1' and value >=79.5):
            return 'background:orange'
        elif (colname=='2021-H1' and value <79.5):
            return 'background:red'
        else:
            return ''


division_summary_table.apply(styler)

When I apply it gives me the below error:

"Function <function styler at 0x000002566633C940> resulted in the apply method collapsing to a Series".Usually, this is the result of the function returning a single value, instead of list-like.

I'm a beginner and really have no idea how to fix it. Appreciate your advise on a better way to accomplish what I want to achieve. One more thing I need help is how to save the tables in as png file with the conditional formattings.

Thanks.

CodePudding user response:

Let's define functions to colorize subset of columns:

def colorize_year(v):
    return np.select(
        [v >= 89, (v < 89) & (v >= 79.5), v < 79.5],
        ['background: green', 'background: orange', 'background: red']
    )

def colorize_movement(v):
    return np.select(
        [v >= .05, (v < .05) & (v >= -.05), v <= -.05],
        ['background: green', 'background: orange', 'background: red']
    )

(
    df.style
    .apply(colorize_movement, subset=['Movement'])
    .apply(colorize_year, subset=['2021-H1', '2022-H1', '2022-H2'])
)

Result

enter image description here

  • Related