How to compare elements row-wise and column-wise when each cell is a list?

Time:11-05

I have a DataFrame in which each cell is a list with 3 elements. How can I compare the second element of each list, both column-wise and row-wise?

Col1   col2    col3
['A',0.2,5]  ['A',0.4,5]  ['A',0,5]
['A',0.4,5]  ['A',0.2,5]  ['A',0.7,5]
['A',0.1,5]  ['A',0.1,5]  ['A',0.20,5]
['A',0.25,5]  ['A',0.9,5]  ['A',0.22,5]

Max of the second element, per column:

col1=0.4
col2=0.9
col3=0.7

Max of the second element, per row:

row1=0.4
row2=0.7
row3=0.20
row4=0.9

CodePudding user response:

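Use applymap to extract the second element from each list, then pass axis to max to get the maximum per column or per row. (In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which is called the same way.)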

# keep only the second element of each list
new_df = df.applymap(lambda x: x[1])
# max per row
new_df.max(axis=1)
# max per column (axis=0 is the default)
new_df.max()
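
For reference, here is a self-contained sketch of the same idea run against the sample data from the question. The names `elementwise`, `second`, `row_max` and `col_max` are only illustrative, and the fallback between `DataFrame.map` and `applymap` is an assumption about which pandas version is installed.

```python
import pandas as pd

# Sample data copied from the question; each cell is a 3-element list.
df = pd.DataFrame({
    "Col1": [["A", 0.2, 5], ["A", 0.4, 5], ["A", 0.1, 5], ["A", 0.25, 5]],
    "col2": [["A", 0.4, 5], ["A", 0.2, 5], ["A", 0.1, 5], ["A", 0.9, 5]],
    "col3": [["A", 0, 5], ["A", 0.7, 5], ["A", 0.20, 5], ["A", 0.22, 5]],
})

# DataFrame.map exists from pandas 2.1; older versions only have applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# Keep the second element of every list and make the result numeric.
second = elementwise(lambda cell: cell[1]).astype(float)

row_max = second.max(axis=1)  # one value per row
col_max = second.max(axis=0)  # one value per column

print(row_max.tolist())
print(col_max.to_dict())
```

Extracting the numbers once into `second` means any other reduction (min, mean, idxmax) can reuse the same numeric frame.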

CodePudding user response:

For a vectorized solution, you can stack the DataFrame into a single Series, slice the second element with .str[1], then use groupby.max on the desired index level.

For rows:

df.stack().str[1].groupby(level=0).max()

Output:

0    0.4
1    0.7
2    0.2
3    0.9
dtype: float64

For columns:

df.stack().str[1].groupby(level=1).max()

Output:

Col1    0.4
col2    0.9
col3    0.7
dtype: float64
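
To make the two index levels concrete, here is a minimal sketch of the stack approach on the question's data. The variable names are illustrative, and the `astype(float)` call is an added assumption to guarantee a numeric dtype, since `.str[1]` on a Series of lists returns object dtype.

```python
import pandas as pd

# Sample data copied from the question; each cell is a 3-element list.
df = pd.DataFrame({
    "Col1": [["A", 0.2, 5], ["A", 0.4, 5], ["A", 0.1, 5], ["A", 0.25, 5]],
    "col2": [["A", 0.4, 5], ["A", 0.2, 5], ["A", 0.1, 5], ["A", 0.9, 5]],
    "col3": [["A", 0, 5], ["A", 0.7, 5], ["A", 0.20, 5], ["A", 0.22, 5]],
})

# stack() returns one Series indexed by (row label, column label).
stacked = df.stack()

# .str[1] picks the second item of each list; cast to float for a numeric dtype.
second = stacked.str[1].astype(float)

row_max = second.groupby(level=0).max()  # level 0 = original row labels
col_max = second.groupby(level=1).max()  # level 1 = original column labels

print(row_max.to_dict())
print(col_max.to_dict())
```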

CodePudding user response:

Another possible solution:

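def mymax(x):
    # x is one column (or one row) of lists; compare their second elements
    return max(y[1] for y in x)

df.apply(mymax)          # max per column
df.apply(mymax, axis=1)  # max per row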

EDIT

The following code tries to answer the OP's comment below; it returns the position of the maximum instead of its value:

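import numpy as np

def imax(x):
    # integer position of the largest second element
    return np.argmax([y[1] for y in x])

df.apply(imax)          # row position of the max, per column
df.apply(imax, axis=1)  # column position of the max, per row

# in case we want the column names
df.columns[df.apply(imax, axis=1)].tolist()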
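
If labels rather than integer positions are what is needed, one possible alternative is `idxmax` on the extracted values. This is only a sketch under that assumption; `second`, `col_of_row_max` and `row_of_col_max` are illustrative names, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question; each cell is a 3-element list.
df = pd.DataFrame({
    "Col1": [["A", 0.2, 5], ["A", 0.4, 5], ["A", 0.1, 5], ["A", 0.25, 5]],
    "col2": [["A", 0.4, 5], ["A", 0.2, 5], ["A", 0.1, 5], ["A", 0.9, 5]],
    "col3": [["A", 0, 5], ["A", 0.7, 5], ["A", 0.20, 5], ["A", 0.22, 5]],
})

# Extract the second element column by column and make the frame numeric.
second = df.apply(lambda col: col.str[1]).astype(float)

# idxmax returns the label of the first maximum along the given axis.
col_of_row_max = second.idxmax(axis=1)  # column name holding each row's max
row_of_col_max = second.idxmax(axis=0)  # row label holding each column's max

print(col_of_row_max.tolist())
print(row_of_col_max.to_dict())
```

Like `np.argmax`, `idxmax` reports the first occurrence when several cells tie for the maximum.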