Get and assign value to rows in group with index greater than those from idxmax()

The objective is to assign 1 to every row in a group whose index is greater than the one returned by idxmax().

import numpy as np
import pandas as pd
df = pd.DataFrame({'id':[1, 1, 1, 2, 2, 2, 3, 3, 3], 'val':[1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

   id  val
0   1  1.0
1   1  NaN
2   1  0.0
3   2  NaN
4   2  1.0
5   2  0.0
6   3  1.0
7   3  0.0
8   3  0.0

We can use idxmax() to get the index of the highest value in each group:

test = df.groupby('id')['val'].idxmax()

id
1    0
2    4
3    6
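
As a side note on what idxmax() returns here: with the default skipna=True it ignores NaN, and on ties it returns the label of the first occurrence of the maximum. A minimal sketch on a made-up Series (the values and labels below are illustrative, not taken from the data above):

```python
import numpy as np
import pandas as pd

# Hypothetical Series: a NaN first, then the maximum appearing twice.
s = pd.Series([np.nan, 1.0, 1.0, 0.0], index=[10, 11, 12, 13])

# NaN is skipped by default; on ties the label of the first maximum is returned.
first_max_label = s.idxmax()
print(first_max_label)  # 11
```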

The objective is to transform the data so that every value in a group with a higher index than the one returned by idxmax() is assigned a 1.

This does not necessarily need to be solved with idxmax(); I am open to any suggestions.

CodePudding user response:

If I understand the problem correctly, you can use transform and np.where:

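# Series mapping each id to the index label of that group's maximum
nd = df.groupby('id')['val'].idxmax()
# x.name is the group key; looking it up in nd avoids assuming ids are 1, 2, 3, ...
df['val'] = df.groupby('id')['val'].transform(lambda x: np.where(x.index > nd[x.name], 1, x))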

df

Output:

    id  val
0   1   1.0
1   1   1.0
2   1   1.0
3   2   NaN
4   2   1.0
5   2   1.0
6   3   1.0
7   3   1.0
8   3   1.0
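
The same idea can also be written without a lambda, by mapping each row's id to its group's idxmax() label and comparing against the row index. This is only a sketch: the names peak, peak_per_row, after_peak and out are made up for illustration, and it assumes the index is in row order (as with the default RangeIndex).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'val': [1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

# Index label of the maximum per group (Series indexed by id).
peak = df.groupby('id')['val'].idxmax()

# Broadcast each group's label back to every row of that group.
peak_per_row = df['id'].map(peak)

# True for rows that come after their group's maximum.
after_peak = df.index.to_numpy() > peak_per_row.to_numpy()

# Work on a copy so the original frame is left untouched.
out = df.copy()
out.loc[after_peak, 'val'] = 1
print(out)
```

On the sample frame this gives the same val column as the output above.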

CodePudding user response:

Try:

df = pd.DataFrame({'id':[1, 1, 1, 2, 2, 2, 3, 3, 3], 'val':[1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

# cummax fills everything after the first True to True in each group
# mask replaces the 0s that were originally nan by nan
df.val = df.val.eq(1).groupby(df.id).cummax().astype(int).mask(lambda x: x.eq(0) & df.val.isna())
df
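
To see what the one-liner does, it may help to split it into steps. The sketch below uses intermediate names (is_one, seen_one, result) that are purely illustrative, and it casts to int before cummax rather than after, which should not change the result. Note that it keys on the first 1 in each group, so it matches the idxmax() formulation only when the group maximum is 1, as in the sample data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'val': [1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

# 1 where val equals 1, else 0 (NaN compares as not equal, so it becomes 0).
is_one = df['val'].eq(1).astype(int)

# Running maximum within each id: stays 1 from the first 1 onwards.
seen_one = is_one.groupby(df['id']).cummax()

# Rows still at 0 that were NaN originally go back to NaN.
result = seen_one.mask(seen_one.eq(0) & df['val'].isna())
print(result)
```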