Keep/select columns with the n highest values in last row

Time:10-12

So I have a dataframe as follows:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.array([[1, 2, 3], [4, 3, 6], [7, 2, 9]]),
                   columns=['a', 'b', 'c'])

df

Output:

   a  b  c
0  1  2  3
1  4  3  6
2  7  2  9

I want to select or keep the two columns with the highest values in the last row. What is the best way to approach this? In this example, I want to keep only column 'a' (last-row value 7) and column 'c' (last-row value 9).

CodePudding user response:

Try:

df = df[df.iloc[-1].nlargest(2).index]

Output:

   c  a
0  3  1
1  6  4
2  9  7
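
To see why this works, the one-liner can be split into steps. This is only an illustrative breakdown of the same call chain; the intermediate names (last_row, top, top_cols, result) are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 3], [4, 3, 6], [7, 2, 9]]),
                  columns=['a', 'b', 'c'])

# Last row as a Series whose index holds the column names: a=7, b=2, c=9
last_row = df.iloc[-1]

# The two largest entries, sorted from largest to smallest: c=9, a=7
top = last_row.nlargest(2)

# The index of that Series holds the column labels to keep: ['c', 'a']
top_cols = top.index

# Selecting with those labels returns the columns in ranked order
result = df[top_cols]
print(result)
```

Because nlargest sorts by value, the columns come back ordered by their last-row value ('c' before 'a'), not in their original order. Note that nlargest expects numeric data, so this probably needs adjusting if the last row mixes numbers with strings or other objects.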

CodePudding user response:

If you also want to keep the original column order, you can combine Index.intersection() with .nlargest(), as follows:

df[df.columns.intersection(df.iloc[-1].nlargest(2).index, sort=False)]

Result:

   a  c
0  1  3
1  4  6
2  7  9
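
If this is needed for an arbitrary n, both variants can be wrapped in a small function. The sketch below is one possible way to do it, not part of either answer: keep_top_n_columns and its keep_order parameter are hypothetical names, and it uses a boolean mask from Index.isin() as an alternative to Index.intersection() for preserving the original order.

```python
import numpy as np
import pandas as pd


def keep_top_n_columns(frame, n, keep_order=True):
    """Hypothetical helper: keep the n columns with the largest last-row values."""
    # Column labels of the n largest values in the last row, largest first
    top = frame.iloc[-1].nlargest(n).index
    if keep_order:
        # Boolean mask over the columns keeps them in their original order
        return frame.loc[:, frame.columns.isin(top)]
    # Otherwise return the columns ranked by last-row value
    return frame[top]


df = pd.DataFrame(np.array([[1, 2, 3], [4, 3, 6], [7, 2, 9]]),
                  columns=['a', 'b', 'c'])

ordered = keep_top_n_columns(df, 2)                   # columns a, c
ranked = keep_top_n_columns(df, 2, keep_order=False)  # columns c, a
print(ordered)
print(ranked)
```

With tied values in the last row, nlargest keeps the first occurrences by default, so which of the tied columns survives depends on column position.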