Home > Back-end >  PandasTable: Highlight greatest value per group
PandasTable: Highlight greatest value per group

Time:09-21

I'm new to Python. I have a dataframe and a datatable to show it in a GUI. enter image description here

Step 1: For every couple of unique values in column A and B, I need to search for the maximum value in column 'C' I managed to do it with groupby (see code) In this picture, you can see the result of Groupby, with the max values in column C.

enter image description here

Step2 For every couple of unique values in column A and B, I need to highlight the corresponding max value in column C. The result should be like this: enter image description here

Can somebody help me, please?

import tkinter as tk
import pandas as pd
from pandastable import Table

df = pd.DataFrame({
    'A': ['alfa','beta','alfa','alfa','beta','beta'],
    'B': ['a','a','a','b','a','a'],
    'C': [1,2,3,4,7,6],
})

root = tk.Tk()

frame1 = tk.Frame(root)
frame1.pack()

pt = Table(frame1, dataframe=df) 
pt.show()

print(df.loc[df.groupby(["A", "B"])["C"].idxmax()] )

pt.setColorByMask( 'C', pt.model.df.iloc[:, 2] == pt.model.df.iloc[:, 2].max(),  'lightgreen'  )

root.mainloop()

CodePudding user response:

You can create a new column in df, indicating if it contains the max value. This column can then be used as a color mask in setColorByMask For example like this:

import tkinter as tk
import pandas as pd
from pandastable import Table

df = pd.DataFrame({
    'A': ['alfa','beta','alfa','gamma','beta','delta'],
    'B': ['a','b','a','b','a','b'],
    'C': [1,2,3,4,7,6],
})

root = tk.Tk()

frame1 = tk.Frame(root)
frame1.pack()

pt = Table(frame1, dataframe=df)
pt.show()

max_value_rows = df.loc[df.groupby(["A", "B"])["C"].idxmax()]

print(max_value_rows)

df['color_mask'] = False
df['color_mask'][max_value_rows.index] = True
color_mask = df['color_mask']

pt.setColorByMask( 'C', color_mask,  'lightgreen'  )

root.mainloop()

Edit: You can also create the color mask like this:

max_value_rows = df.loc[df.groupby(["A", "B"])["C"].idxmax()]

color_mask = pd.Series(len(df) * [False])
color_mask[max_value_rows.index] = True

pt.setColorByMask( 'C', color_mask,  'lightgreen'  )

CodePudding user response:

To highlight an entire row, you may take a look of the discussion here:

How to use Pandas stylers for coloring an entire row based on a given column?

  • Related