Most efficient way for finding 3 rows with maximum value in column?

Time:05-09

Let us say there is a DataFrame df:

Name  Balance
A     1000
B     5000
C     3000
D     6000
E     2000
F     5000

I am looking for a way to get the three rows with the highest balances.

df['Balance'].get_indices_max(n=3)  # hypothetical method; n is the number of results required

Expected output when these indices are used to select the rows:

D 6000
F 5000
B 5000

CodePudding user response:

Answer

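import pandas as pd

df = pd.DataFrame({"Name": list("ABCDEF"), "Balance": [1000, 5000, 3000, 6000, 2000, 5000]})
# Index labels of the three largest balances, then select those rows
index = df["Balance"].nlargest(3).index
df.loc[index]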

Output

  Name  Balance
3    D     6000
1    B     5000
5    F     5000

Attention

From the pandas documentation for DataFrame.nlargest: the columns that are not specified are returned as well, but not used for ordering. This method is equivalent to df.sort_values(columns, ascending=False).head(n), but more performant.
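
As a rough sketch of that equivalence (assuming pandas is imported as pd and the same data as in the question; top3 and top3_sorted are names made up for this illustration), calling nlargest on the DataFrame itself returns the full rows without the intermediate index lookup:

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({"Name": list("ABCDEF"),
                   "Balance": [1000, 5000, 3000, 6000, 2000, 5000]})

# DataFrame.nlargest returns whole rows, ordered by the given column
top3 = df.nlargest(3, "Balance")

# The sort-based form that the documentation describes as equivalent
top3_sorted = df.sort_values("Balance", ascending=False).head(3)

print(top3)
print(top3_sorted)
```

Rows that tie on Balance may come out in a different order between the two approaches, so it is safer to compare the selected values than the exact row order.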


CodePudding user response:

I usually do

out = df.sort_values('Balance').iloc[-3:]  # last three rows after an ascending sort
Out[476]: 
  Name  Balance
1    B     5000
5    F     5000
3    D     6000
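
One thing to watch with positional slicing, shown here as a minimal sketch: .iloc[3:] means "everything from the fourth row on", while .iloc[-3:] (or .tail(3)) always means "the last three rows". The two only coincide when the frame has exactly six rows. The seventh row G below is a hypothetical addition that is not in the question, and df7, from_fourth and last_three are illustrative names.

```python
import pandas as pd

# Question data plus one hypothetical extra row (G) to make the frame 7 rows long
df7 = pd.DataFrame({"Name": list("ABCDEFG"),
                    "Balance": [1000, 5000, 3000, 6000, 2000, 5000, 4000]})

ordered = df7.sort_values("Balance")  # ascending by default

from_fourth = ordered.iloc[3:]   # rows from position 3 onward: 4 rows here
last_three = ordered.iloc[-3:]   # always the final 3 rows; same as ordered.tail(3)

print(from_fourth)
print(last_three)
```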