How can I get a list of all the columns that hold the maximum value in each row of a DataFrame? For example, if I have this dataframe,
df = pd.DataFrame({'a':[12,34,98,26],'b':[12,87,98,12],'c':[11,23,43,1]})
    a   b   c
0  12  12  11
1  34  87  23
2  98  98  43
3  26  12   1
I want to make another dataframe as shown below,
0 [a,b]
1 [b]
2 [a,b]
3 [a]
How can I do that?
CodePudding user response:
You could try either of these (both approaches do the same thing):
max_cols_df = pd.DataFrame({"max_cols": [list(df[df==mv].iloc[i].dropna().index) for i, mv in enumerate(df.max(axis=1))]})
max_cols_df = pd.DataFrame({"max_cols": [list(df.iloc[i][v].index) for i, v in df.eq(df.max(axis=1), axis=0).iterrows()]})
max_cols_df
----------------
max_cols
0 [a, b]
1 [b]
2 [a, b]
3 [a]
----------------
But I think you asked a related question in another post and cannot upgrade pandas. In that case, the second approach could throw an error.
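If the pandas version is a concern, a variant that only indexes the column labels with the boolean mask's underlying NumPy array may be worth trying. This is a minimal sketch and not part of the answer above; the name `mask` and the use of `.values` are my own assumptions, and I have not checked it against any specific old pandas release.

```python
import pandas as pd

df = pd.DataFrame({'a': [12, 34, 98, 26], 'b': [12, 87, 98, 12], 'c': [11, 23, 43, 1]})

# True wherever a cell equals the maximum of its row
mask = df.eq(df.max(axis=1), axis=0)

# Index the column labels with each boolean row of the underlying NumPy array
max_cols_df = pd.DataFrame(
    {"max_cols": [list(df.columns[row]) for row in mask.values]},
    index=df.index,
)
print(max_cols_df)
```

Because the selection happens on `df.columns` rather than on a row of the dataframe, it avoids boolean indexing into a Series altogether.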
CodePudding user response:
Solved using Pandas Melt
import pandas as pd
df = pd.DataFrame({'a':[12,34,98,26],'b':[12,87,98,12],'c':[11,23,43,1]})
df = df.reset_index()
df = pd.melt(df, id_vars=['index'],value_vars=['a','b','c'], var_name='col_name')
df['Index_Value_Max'] = df.groupby(['index'])['value'].transform('max')
df[df['value'] == df['Index_Value_Max']].groupby(['index']).agg({'col_name':'unique'}).reset_index()
Output:
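The melt approach can also be sketched in a self-contained form that leaves the original df untouched and collects plain lists rather than arrays. The names `long_df`, `row_max` and `result` are mine, not from the answer, so treat this as one possible arrangement under those assumptions.

```python
import pandas as pd

df = pd.DataFrame({'a': [12, 34, 98, 26], 'b': [12, 87, 98, 12], 'c': [11, 23, 43, 1]})

# Long format: one row per (original row index, column name) pair
long_df = df.reset_index().melt(id_vars='index', var_name='col_name')

# Row-wise maximum, broadcast back onto the long frame
row_max = long_df.groupby('index')['value'].transform('max')

# Keep only the cells that reach the maximum and collect their column names
result = long_df[long_df['value'] == row_max].groupby('index')['col_name'].agg(list)
print(result)
```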
CodePudding user response:
First, get the max value in each row:
df.max(axis=1)
Then check the dataframe for values that equal max values.
df.eq(df.max(axis=1), axis=0)
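Then apply a function row-wise to that boolean dataframe to select the column names where the value in the row is True.
mask = df.eq(df.max(axis=1), axis=0)
# Series.iteritems() was removed in pandas 2.0; items() is the equivalent
mask.apply(
lambda row: [k for k, v in row.items() if v],
axis=1
)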
Put together, this is:
df.eq(df.max(axis=1), axis=0).apply(
lambda row: [k for k, v in row.items() if v],
axis=1
)
0 [a, b]
1 [b]
2 [a, b]
3 [a]
dtype: object
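For convenience, the three steps can be wrapped in a small helper and run against the example data. This is only a sketch: the function name `max_columns` is hypothetical, and it uses `Series.items()` on the assumption that `iteritems()` is not available in the installed pandas.

```python
import pandas as pd


def max_columns(frame):
    """Return, for each row, the list of column labels holding that row's maximum."""
    # Steps 1 and 2: compare every cell with the maximum of its own row
    is_max = frame.eq(frame.max(axis=1), axis=0)
    # Step 3: keep the column labels whose mask value is True
    return is_max.apply(lambda row: [col for col, hit in row.items() if hit], axis=1)


df = pd.DataFrame({'a': [12, 34, 98, 26], 'b': [12, 87, 98, 12], 'c': [11, 23, 43, 1]})
result = max_columns(df)
print(result)
```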