2nd largest value in each row

Time:08-24

How can I create a column col4 that contains the 2nd largest value in each row?

import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 1, 5],
                   [5, 2, 9],
                   [2, 9, 3],
                   [8, 5, 4]],
                  columns=["col_A", "col_B", "col_C"])

cols = np.array(df.columns)

df['col4'] = df.nlargest(2, columns=cols) #wrong

CodePudding user response:

You can use indexing on the output of np.sort:

N = 2
df['col4'] = np.sort(df)[:, -N]

Alternative with apply:

df['col4'] = df.apply(lambda r: r.nlargest(2).iloc[-1], axis=1)

output:

   col_A  col_B  col_C  col4
0      4      1      5     4
1      5      2      9     5
2      2      9      3     3
3      8      5      4     5
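
One thing to watch for: np.sort(df) sorts every column in df, so running the assignment a second time would include col4 itself. A minimal sketch that restricts the sort to the original columns is shown here; the helper name nth_largest_per_row and the value_cols list are made up for illustration and are not part of the answer above.

```python
import numpy as np
import pandas as pd

def nth_largest_per_row(frame, cols, n=2):
    """Return the n-th largest value of each row, looking only at `cols`.

    Hypothetical helper for illustration; assumes the columns are numeric
    and contain no NaN values.
    """
    values = frame[cols].to_numpy()
    # Sort each row in ascending order, then take the n-th entry from the right.
    return np.sort(values, axis=1)[:, -n]

df = pd.DataFrame([[4, 1, 5],
                   [5, 2, 9],
                   [2, 9, 3],
                   [8, 5, 4]],
                  columns=["col_A", "col_B", "col_C"])

value_cols = ["col_A", "col_B", "col_C"]
df["col4"] = nth_largest_per_row(df, value_cols)
# Running it again gives the same result, because col4 is never an input.
df["col4"] = nth_largest_per_row(df, value_cols)
print(df)
```

With the sample data this should give the same col4 as the output above (4, 5, 3, 5).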

CodePudding user response:

For each row, you can sort the values and take the second-to-last one, as follows:

df["col4"] = df.apply(lambda x: sorted(x)[-2], axis=1)
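
As a quick sanity check, the three approaches from this thread can be compared on the sample data. This is only a sketch: the variable names (by_sort, by_nlargest, by_sorted) are made up here, and it assumes the frame holds just the three original numeric columns with no NaN values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 1, 5],
                   [5, 2, 9],
                   [2, 9, 3],
                   [8, 5, 4]],
                  columns=["col_A", "col_B", "col_C"])

# 1) Vectorised: sort each row and index from the right.
by_sort = np.sort(df.to_numpy(), axis=1)[:, -2].tolist()

# 2) Row-wise with Series.nlargest: the last of the two largest values.
by_nlargest = df.apply(lambda r: r.nlargest(2).iloc[-1], axis=1).tolist()

# 3) Row-wise with Python's sorted(): the second-to-last value.
by_sorted = df.apply(lambda r: sorted(r)[-2], axis=1).tolist()

print(by_sort == by_nlargest == by_sorted)
```

On larger frames the np.sort version is usually the faster choice, since apply with axis=1 calls a Python function once per row.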