Home > Back-end >  Add a column filled with values that correspond to the max value in another column for every value i
Add a column filled with values that correspond to the max value in another column for every value i

Time:12-24

Having this dataframe

    state   in  out 
0   case1    1  -5  
1   case2    0  -1  
2   case2   -1   8
3   case1   -2   5
4   case2   -2   1

I need to make an additional column with values from "in" that correspond to max values in "out" for every "state" (case_1, case_2).

    state   in  out  new
0   case_1   1  -5   -2
1   case_2   0  -1   -1
2   case_2  -1   8   -1
3   case_1  -2   5   -2
4   case_2  -2   1   -1

CodePudding user response:

Try:

df['new'] = df.loc[df['state'].map(df.groupby('state')['out'].idxmax()), 'in'].values
print(df)

# Output:
   state  in  out  new
0  case1   1   -5   -2
1  case2   0   -1   -1
2  case2  -1    8   -1
3  case1  -2    5   -2
4  case2  -2    1   -1

CodePudding user response:

Let us try transform

df['new'] = df.set_index('in').groupby('state')['out'].transform('idxmax').values
df
Out[99]: 
   state  in  out  new
0  case1   1   -5   -2
1  case2   0   -1   -1
2  case2  -1    8   -1
3  case1  -2    5   -2
4  case2  -2    1   -1

CodePudding user response:

Another approach that uses merge function:

new = pd.merge(df,df.groupby('state',as_index=False)['out'].max()).drop('out',axis=1).rename({'in':'new'},axis=1)
>>> pd.merge(df,new,on='state',how='left')

  state  in  out  new
0  case1   1   -5   -2
1  case2   0   -1   -1
2  case2  -1    8   -1
3  case1  -2    5   -2
4  case2  -2    1   -1
  • Related