Marking final rows in DataFrame by unique values in other columns

Time:12-09

For a DataFrame like:

   col1  col2  val
0     1     1    1
1     1     1    2
2     1     1    3
3     1     2    1
4     1     2    2
5     1     2    3
6     2     1    1
7     2     1    2
8     2     2    1

I want to produce:

   col1  col2  val  final
0     1     1    1  False
1     1     1    2  False
2     1     1    3   True
3     1     2    1  False
4     1     2    2  False
5     1     2    3   True
6     2     1    1  False
7     2     1    2   True
8     2     2    1   True

Essentially, I want to mark the final (largest) val for each unique combination of col1 and col2.

(The data is sorted ascending by col1, then col2, then val.)

I have tried looping and setting each as they come up like the following:

df["final"] = False

for col1 in df["col1"].unique():
    for col2 in df["col2"].unique(): 
        df[df["col1"].eq(col1) & df["col2"].eq(col2)].iloc[-1]["final"] = True

But that doesn't set any values: the final column stays False for every row.
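
A likely reason the loop has no effect: df[mask] returns a new DataFrame, and .iloc[-1] returns a new Series taken from it, so the assignment lands on a temporary object instead of df. A minimal sketch of a loop that does write back, assuming the sample data above and using df.loc with the last index label of each group (the names c1, c2 and group are only illustrative):

```python
import pandas as pd

# Sample data from the question, already sorted by col1, col2, val.
df = pd.DataFrame({
    "col1": [1, 1, 1, 1, 1, 1, 2, 2, 2],
    "col2": [1, 1, 1, 2, 2, 2, 1, 1, 2],
    "val":  [1, 2, 3, 1, 2, 3, 1, 2, 1],
})

df["final"] = False

# Iterate only over (col1, col2) pairs that actually occur in the data.
for (c1, c2), group in df.groupby(["col1", "col2"]):
    # group.index[-1] is the label of the last row of this group in df;
    # a single .loc call assigns into df itself, not into a copy.
    df.loc[group.index[-1], "final"] = True

print(df)
```

Iterating over the groupby object also avoids visiting (col1, col2) pairs that never occur together, which would make .iloc[-1] fail on an empty selection.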

CodePudding user response:

Try groupby with transform('max'), comparing each val with the maximum of its (col1, col2) group:

df['new'] = df.val.eq(df.groupby(['col1','col2']).val.transform('max'))
df
Out[383]: 
   col1  col2  val    new
0     1     1    1  False
1     1     1    2  False
2     1     1    3   True
3     1     2    1  False
4     1     2    2  False
5     1     2    3   True
6     2     1    1  False
7     2     1    2   True
8     2     2    1   True
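
Since the question states that the data is already sorted, another option may be to mark the last occurrence of each (col1, col2) pair directly. This is a sketch of a different technique from the groupby answer above, using DataFrame.duplicated with keep="last"; it assumes the sort order described in the question and names the column final as the question asks:

```python
import pandas as pd

# Sample data from the question, already sorted by col1, col2, val.
df = pd.DataFrame({
    "col1": [1, 1, 1, 1, 1, 1, 2, 2, 2],
    "col2": [1, 1, 1, 2, 2, 2, 1, 1, 2],
    "val":  [1, 2, 3, 1, 2, 3, 1, 2, 1],
})

# duplicated(keep="last") is True for every row of a (col1, col2) pair
# except its last occurrence, so negating it flags the last row per pair.
df["final"] = ~df.duplicated(subset=["col1", "col2"], keep="last")

print(df)
```

The two approaches differ when a group contains ties for the largest val: the transform('max') comparison marks every tied row, while this version marks exactly one row per group, and it is only correct if the rows are sorted as described.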