Home > database >  Get max value across subset of rows and compare to constant to return max in new column
Get max value across subset of rows and compare to constant to return max in new column

Time:11-22

I am trying to create a new column in a dataframe that is the maximum value across two columns or a constant value. Whichever is the largest value will be returned to the new column.

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'loan_num': ['111', '333', '555', '777'],
    'bllnterm': [0, 240, 360, 240],
    'amortterm': [0, 360, 360, 360]
})

I have tried using pd.clip, np.maximum, and np.amax but none seem to run without throwing an error.

df = df.assign(amtz = df[['bllnterm', 'amortterm']].clip(lower=1, axis=1))

This returns a ValueError: Wrong number of items passed 2, placement implies 1

df = df.assign(amtz = np.maximum(df[['bllnterm', 'amortterm']], 1))

This returns a ValueError: Wrong number of items passed 2, placement implies 1

df = df.assign(amtz = np.amax(df[['bllnterm', 'amortterm']], axis=1, initial=1))

This returns a TypeError: max() got an unexpected keyword argument 'initial'. However, initial is a keyword in the docs so I'm not sure what is going on there.

My desired output looks like this:

loan_num     bllnterm    amortterm     amtz
----------------------------------------------
111            0            0           1
333            240          360         360
555            360          360         360
777            240          360         360

CodePudding user response:

You were on the right track, you need to combine max and clip:

df['amtz'] = df[['bllnterm', 'amortterm']].max(axis=1).clip(lower=1)

As assign:

df.assign(amtz=df[['bllnterm', 'amortterm']].max(axis=1).clip(lower=1))

output:

  loan_num  bllnterm  amortterm  amtz
0      111         0          0     1
1      333       240        360   360
2      555       360        360   360
3      777       240        360   360
  • Related