How to get top 3 values in a row by ID and return them in a new column pandas

Time:06-13

I understand that the following finds the highest value in each row and stores the name of the column holding it in a new column called 'Top Value':

df['Top Value'] = df[['VolumeOIH','Volume KRE']].idxmax(axis=1)

How do I do the same for the top 3 values in each row, returning their column names?

This is my dataframe:

     VolumeKBE  VolumeKRE  VolumeIYR  VolumeITB  VolumeSMH
0      2722.0    51852.0    10873.0    28562.0    84673.0
1      2500.0    54027.0     7157.0    11278.0    42034.0
2      2279.0    46517.0     1700.0    20291.0    64202.0
3      8200.0    43994.0     7500.0    34564.0   260018.0
4      9688.0    52993.0     4400.0    25912.0    79126.0
..        ...        ...        ...        ...        ...
64     1200.0    11411.0    19891.0    29535.0    37648.0
65     3500.0    17334.0    24248.0    25006.0    58842.0
66     1200.0    16353.0    23023.0    30704.0   118051.0
67     5700.0    13611.0    12139.0    22182.0    35798.0
68      578.0    11291.0     5780.0    27310.0    68584.0

CodePudding user response:

To get the names of the top 3 columns by value in each row, convert the DataFrame's column names and values to NumPy arrays and use numpy.argsort:

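import numpy as np
import pandas as pd

N = 3
c = df.columns.to_numpy()
# negate the values so argsort orders each row from largest to smallest, then keep the first N positions
topN = c[np.argsort(-df.to_numpy())[:, :N]]

# new column names: top1, top2, ..., topN
cols = [f'top{x + 1}' for x in range(N)]
df = pd.DataFrame(topN, index=df.index, columns=cols)
print(df)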
         top1       top2       top3
0   VolumeSMH  VolumeKRE  VolumeITB
1   VolumeKRE  VolumeSMH  VolumeITB
2   VolumeSMH  VolumeKRE  VolumeITB
3   VolumeSMH  VolumeKRE  VolumeITB
4   VolumeSMH  VolumeKRE  VolumeITB
64  VolumeSMH  VolumeITB  VolumeIYR
65  VolumeSMH  VolumeITB  VolumeIYR
66  VolumeSMH  VolumeITB  VolumeIYR
67  VolumeSMH  VolumeITB  VolumeKRE
68  VolumeSMH  VolumeITB  VolumeKRE
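
The code above replaces df with a frame holding only the top-N names. If the goal is to keep the original volume columns and add the results next to them, a minimal sketch along the same lines might look like the following. It uses only the first three rows of the question's data as a sample, and the `top1_value`-style column names are an assumption for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Sample built from the first three rows of the question's data.
df = pd.DataFrame({
    'VolumeKBE': [2722.0, 2500.0, 2279.0],
    'VolumeKRE': [51852.0, 54027.0, 46517.0],
    'VolumeIYR': [10873.0, 7157.0, 1700.0],
    'VolumeITB': [28562.0, 11278.0, 20291.0],
    'VolumeSMH': [84673.0, 42034.0, 64202.0],
})

N = 3
# Capture labels and values before any new columns are added,
# so that only the original volume columns are ranked.
c = df.columns.to_numpy()
values = df.to_numpy()

# Column positions of the N largest values in each row, largest first.
order = np.argsort(-values, axis=1)[:, :N]

names = c[order]                                       # column labels
top_vals = np.take_along_axis(values, order, axis=1)   # matching values

# Hypothetical output column names: top1, top1_value, top2, ...
for i in range(N):
    df[f'top{i + 1}'] = names[:, i]
    df[f'top{i + 1}_value'] = top_vals[:, i]

print(df[['top1', 'top2', 'top3']])
```

Taking the labels and values before the loop matters here: once the string columns are added, df.to_numpy() would no longer be purely numeric and the negation would fail.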