I have a dataframe of FEMA flood risk 2.0 data. Each column name indicates how much a home owner's monthly flood insurance premimums will change. Each row contains a float
representing the percentage of homes within that Zip Code will have their premimums adjusted according to the corresponding column name. Like so:
Col1 Col2 Col3 Col4
Zip Code < -$100 -$100 to -$90 -$90 to -$80
33010 .0194 .0036 .0040
33138 .0568 .0082 .0109
33141 .0101 .0008 .0019
I want to add a column where each row contains the name of the column that had the highest value for each row. My output should be:
Col1 Col2 Col3 Col4 Col5
Zip Code < -$100 -$100 to -$90 -$90 to -$80 common_price_change
33010 .0194 .0036 .0040 <-$100
33138 .0568 .1082 .0109 -$100 to -$90
33141 .0101 .0008 .0319 -$90 to -$80
CodePudding user response:
You could use the df.idxmax()
method to get what you want.
Here is an example:
import pandas as pd
df = pd.DataFrame({'zip_code':['zip1','zip2','zip3'],
'col1':[0,1,3],
'col2':[3,2,1],
'col3':[4,0,2]
})
df['max_col'] = df[['col1','col2','col3']].idxmax(axis=1)
df
The max_col
is using idxmax
to get the column name with the highest value.