Get and assign value to rows in group with index greater than those from idxmax()

Time:06-04

The objective is to assign 1 to every row in a group whose index is greater than the index returned by idxmax() for that group.

import numpy as np
import pandas as pd
df = pd.DataFrame({'id':[1, 1, 1, 2, 2, 2, 3, 3, 3], 'val':[1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

   id  val
0   1  1.0
1   1  NaN
2   1  0.0
3   2  NaN
4   2  1.0
5   2  0.0
6   3  1.0
7   3  0.0
8   3  0.0

We can use idxmax() to get the index of the highest value in each group:

test = df.groupby('id')['val'].idxmax()

id
1    0
2    4
3    6
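
As a side note on what idxmax() returns, here is a minimal sketch on a made-up Series (the values and index labels are illustrative, not from the data above): NaN entries are skipped, and when the maximum occurs more than once the label of the first occurrence is returned.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a leading NaN and a tied maximum at labels 11 and 12.
s = pd.Series([np.nan, 2.0, 2.0, 1.0], index=[10, 11, 12, 13])

# idxmax() ignores the NaN and reports the first label holding the maximum.
first_label = s.idxmax()
print(first_label)
```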

The objective is to transform the data to look like this, so that every row in a group with a higher index than the one from idxmax() gets assigned a 1:

   id  val
0   1  1.0
1   1  1.0
2   1  1.0
3   2  NaN
4   2  1.0
5   2  1.0
6   3  1.0
7   3  1.0
8   3  1.0

This does not necessarily need to be done with idxmax(); I am open to any suggestions.

CodePudding user response:

If I understand the problem correctly, you can use transform and np.where:

# Keep idxmax() as a Series keyed by id, so the lookup works for any id values
nd = df.groupby('id')['val'].idxmax()
df['val'] = df.groupby('id')['val'].transform(lambda x: np.where(x.index > nd[x.name], 1, x))

df

Output:

    id  val
0   1   1.0
1   1   1.0
2   1   1.0
3   2   NaN
4   2   1.0
5   2   1.0
6   3   1.0
7   3   1.0
8   3   1.0
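
A variant of the same idea without a per-group lambda, offered as a sketch: map each row's id to its group's idxmax() label and compare that against the row's own index. It assumes, as in the example, that index labels increase within each group; `first_max` and `after_max` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'val': [1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

# Index label of the first maximum in each group (NaN is skipped).
first_max = df.groupby('id')['val'].idxmax()

# Look up each row's group label, then flag rows whose index lies after it.
after_max = df.index.to_numpy() > df['id'].map(first_max).to_numpy()

# Only the flagged rows are overwritten; earlier rows (including NaN) stay as they are.
df.loc[after_max, 'val'] = 1
print(df)
```

Because the lookup goes through `map`, the ids do not have to be consecutive integers starting at 1.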

CodePudding user response:

Try

df = pd.DataFrame({'id':[1, 1, 1, 2, 2, 2, 3, 3, 3], 'val':[1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

# cummax sets everything after the first True to True within each group
# mask restores NaN where the result is 0 and the original value was NaN
df.val = df.val.eq(1).groupby(df.id).cummax().astype(int).mask(lambda x: x.eq(0) & df.val.isna())
df
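
The line above relies on the maximum in every group being 1 (`val.eq(1)`). A sketch of the same cummax idea without that assumption compares each value with its group maximum first, then shifts the running flag by one row so that only rows strictly after the first maximum are overwritten. The names `is_max`, `seen_max` and `after_max` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'val': [1, np.nan, 0, np.nan, 1, 0, 1, 0, 0]})

# 1 where a row holds its group's maximum, else 0 (NaN never compares equal).
is_max = df['val'].eq(df.groupby('id')['val'].transform('max')).astype(int)

# Within each group: 1 from the first maximum onwards.
seen_max = is_max.groupby(df['id']).cummax()

# Shift by one row inside each group to flag rows strictly after the first maximum.
after_max = seen_max.groupby(df['id']).shift(fill_value=0).astype(bool)

# Overwrite only the flagged rows; rows before the maximum keep their value or NaN.
df['val'] = df['val'].mask(after_max, 1)
print(df)
```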
