after processing some data I got df, now I need to get max 3 value from the data frame with column name
data=[[4.12,3,2],[1.0123123,-6.12312,5.123123],[-3.123123,-8.512323,6.12313]]
df = pd.DataFrame(data,columns =['a','b','c'],index=['aa','bb','cc'])
df
output:
a b c
aa 4.120000 3.000000 2.000000
bb 1.012312 -6.123120 5.123123
cc -3.123123 -8.512323 6.123130
Now I assigned each value with a columns name
df1 = df.astype(str).apply(lambda x:x '=' x.name)
a b c
aa 4.12=a 3.0=b 2.0=c
bb 1.0123123=a -6.12312=b 5.123123=c
cc -3.123123=a -8.512323=b 6.12313=c
I need to get the max, I have tried to sort the data frame but not able to get the output
what I need is
final_df
max=1 max=2 max=3
aa 4.12=a 3.0=b 2.0=c
bb 5.123123=c 1.0123123=a -6.12312=b
cc 6.12313=c -3.123123=a -8.512323=b
CodePudding user response:
I suggest you proceed as follows:
import pandas as pd
data=[[4.12,3,2],[1.0123123,-6.12312,5.123123],[-3.123123,-8.512323,6.12313]]
df = pd.DataFrame(data,columns =['a','b','c'],index=['aa','bb','cc'])
# first sort values in descending order
df.values[:, ::-1].sort(axis=1)
# then rename row values
df1 = df.astype(str).apply(lambda x: x '=' x.name)
# rename columns
df1.columns = [f"max={i}" for i in range(1, len(df.columns) 1)]
Result as desired:
max=1 max=2 max=3
aa 4.12=a 3.0=b 2.0=c
bb 5.123123=a 1.0123123=b -6.12312=c
cc 6.12313=a -3.123123=b -8.512323=c
CodePudding user response:
As the solution proposed by @GuglielmoSanchini does not give the expected result, It works as follows:
# Imports
import pandas as pd
import numpy as np
# Data
data=[[4.12,3,2],[1.0123123,-6.12312,5.123123],[-3.123123,-8.512323,6.12313]]
df = pd.DataFrame(data,columns =['a','b','c'],index=['aa','bb','cc'])
df1 = df.astype(str).apply(lambda x:x '=' x.name)
data = []
for index, row in df1.iterrows():
# the indices of the numbers sorted in descending order
indices_max = np.argsort([float(item[:-2]) for item in row])[::-1]
# We add the new values sorted
data.append(row.iloc[indices_max].values.tolist())
# We create the new dataframe with values sorted
df = pd.DataFrame(data, columns = [f"max={i}" for i in range(1, len(df1.columns) 1)])
df.index = df1.index
print(df)
Here is the result:
max=1 max=2 max=3
aa 4.12=a 3.0=b 2.0=c
bb 5.123123=c 1.0123123=a -6.12312=b
cc 6.12313=c -3.123123=a -8.512323=b