Home > Blockchain >  Add column name to row based on max value within that row in pandas, python
Add column name to row based on max value within that row in pandas, python

Time:11-18

I have a dataframe of FEMA flood risk 2.0 data. Each column name indicates how much a home owner's monthly flood insurance premimums will change. Each row contains a float representing the percentage of homes within that Zip Code will have their premimums adjusted according to the corresponding column name. Like so:

Col1       Col2               Col3                   Col4
Zip Code   < -$100        -$100 to -$90          -$90 to -$80
33010      .0194              .0036                  .0040 
33138      .0568              .0082                  .0109
33141      .0101              .0008                  .0019

I want to add a column where each row contains the name of the column that had the highest value for each row. My output should be:

 Col1       Col2               Col3                   Col4        Col5
Zip Code   < -$100        -$100 to -$90          -$90 to -$80    common_price_change
33010      .0194              .0036                  .0040        <-$100
33138      .0568              .1082                  .0109        -$100 to -$90
33141      .0101              .0008                  .0319        -$90 to -$80

CodePudding user response:

You could use the df.idxmax() method to get what you want.

Here is an example:

import pandas as pd

df = pd.DataFrame({'zip_code':['zip1','zip2','zip3'],
              'col1':[0,1,3],
              'col2':[3,2,1],
              'col3':[4,0,2]
              })

df['max_col'] = df[['col1','col2','col3']].idxmax(axis=1)
df

The max_col is using idxmax to get the column name with the highest value.

  • Related