Pandas - find second largest value in each row

Time:12-03

Good morning! I have a three-column DataFrame and need to find the second largest value in each row.

DATA=pd.DataFrame({"A":[10,11,4,5],"B":[23,8,3,4],"C":[12,7,11,9]})

    A   B   C
0  10  23  12
1  11   8   7
2   4   3  11
3   5   4   9

I tried using nlargest, but it seems to be column-based, and I can't find a pandas solution for this problem. Thank you in advance!

CodePudding user response:

import pandas as pd


df=pd.DataFrame({"A":[10,11,4,5],"B":[23,8,3,4],"C":[12,7,11,9]})

# find the second largest value for each row
df['largest2'] = df.apply(lambda x: x.nlargest(2).iloc[1], axis=1)


print(df.head())

result:

    A   B   C  largest2
0  10  23  12        12
1  11   8   7         8
2   4   3  11         4
3   5   4   9         5
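
For larger frames, a row-wise apply can be slow. A possible vectorized alternative is sketched below; it uses NumPy, which the answer above does not, and it assumes all three columns are numeric. Like nlargest(2), it returns the maximum again if the maximum occurs twice in a row.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [10, 11, 4, 5], "B": [23, 8, 3, 4], "C": [12, 7, 11, 9]})

# Select only the original columns so a previously added result column
# (such as 'largest2') does not take part in the ranking.
values = df[["A", "B", "C"]].to_numpy()

# Sort every row in ascending order; the second-to-last entry of each
# sorted row is that row's second largest value.
second_largest = pd.Series(np.sort(values, axis=1)[:, -2], index=df.index)

print(second_largest.tolist())  # [12, 8, 4, 5]
```

The resulting Series can be assigned back with df['largest2'] = second_largest.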

CodePudding user response:

In A Python List

mylist = [1, 2, 8, 3, 12]
print(sorted(mylist, reverse=True)[1])
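
One caveat with the list approach: if the largest value appears more than once, position 1 of the sorted list is the maximum again. The sketch below uses a made-up list with a repeated maximum to show the difference, assuming the second largest distinct value is what is wanted.

```python
mylist = [1, 2, 12, 8, 3, 12]  # hypothetical input with the maximum repeated

# Position 1 of the descending sort is the maximum again when it is duplicated.
with_ties = sorted(mylist, reverse=True)[1]

# Removing duplicates first gives the second largest distinct value.
distinct = sorted(set(mylist), reverse=True)[1]

print(with_ties, distinct)  # 12 8
```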

In A Python Pandas List

import pandas as pd
df=pd.DataFrame({"A":[10,11,4,5],"B":[23,8,3,4],"C":[12,7,11,9]})
# Second largest value of each column; sorted() is ascending, so take index -2
print(sorted(df['A'].nlargest(4))[-2])
print(sorted(df['B'].nlargest(4))[-2])
print(sorted(df['C'].nlargest(4))[-2])

In A Python Pandas List mk.2

import pandas as pd
df=pd.DataFrame({"A":[10,11,4,5],"B":[23,8,3,4],"C":[12,7,11,9]})

num_of_rows = len(df.index)
second_highest = num_of_rows - 2
print(sorted(df['A'].nlargest(num_of_rows))[second_highest])
print(sorted(df['B'].nlargest(num_of_rows))[second_highest])
print(sorted(df['C'].nlargest(num_of_rows))[second_highest])

In A Python Pandas List mk.3

import pandas as pd
df=pd.DataFrame({"A":[10,11,4,5],"B":[23,8,3,4],"C":[12,7,11,9]})

col_names = df.columns
num_of_rows = len(df.index)
second_highest = num_of_rows - 2

# Print the second largest value of each column
for col_name in col_names:
    print(sorted(df[col_name].nlargest(num_of_rows))[second_highest])

In A Python Pandas List mk.4

import pandas as pd
df=pd.DataFrame({"A":[10,11,4,5],"B":[23,8,3,4],"C":[12,7,11,9]})

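# Rank the column labels of each row from largest to smallest value.
# Note: the result holds column labels, not the values themselves.
top_n = len(df.columns)
ranked = pd.DataFrame({n: df.T[col].nlargest(top_n).index.tolist()
                       for n, col in enumerate(df.T)}).T
print(ranked)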
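
If only the column that holds each row's second largest value is needed, a shorter route might look like the sketch below. It uses NumPy's argsort, which is not part of the answer above, and the names order and second_label are illustrative. When a row contains ties, which of the tied columns is reported depends on the sort.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [10, 11, 4, 5], "B": [23, 8, 3, 4], "C": [12, 7, 11, 9]})

# argsort gives, for each row, the column positions in ascending order of value;
# the second-to-last position belongs to the second largest value.
order = np.argsort(df.to_numpy(), axis=1)
second_label = pd.Series(df.columns[order[:, -2]], index=df.index)

print(second_label.tolist())  # ['C', 'B', 'A', 'A']
```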
