Pandas: Apply function with a parameter on a dataframe cell based on condition from other cell value

Time:08-30

I want to apply a function to each cell of a column in my dataframe, based on a condition on the values in another column.

In the example dataframe below, I want the Field3 column (currently showing *) to hold the value returned by my function to_Year_Maturity for the Instrument in that row. Please note that the dataframe has 50k records in production, and the function's return value comes from an expensive database call based on the Instrument passed. I tried the pandas line below, but when I hard-code [0] it returns the same value for every row, namely that of the 0th instrument, so 5Y in this case. My requirement is to pass each row's instrument and replace Field3 for that row, and so on for the whole table. Thanks in advance.

CCY,Field0,Field1,Field3,Scenario,Instrument
USD,CSW,.03456,*,CSW,ABC
USD,SPRD,.00238,*,DV1,GFH
USD,IR0,.0098,*,IR1,WQD
USD,IRF0,.256,*,IRF,POU
USD,CSW,.9876,*,CSW,LKY
USD,CSW,.12678,*,CSW,BVF
USD,SPRD,.0072,*,DV1,FCD
USD,IR0,.03248,*,IR1,LKS
USD,IRF0,.0963,*,IRF,ZSX
USD,CSW,.5632,*,CSW,BNH
USD,SPRD,.9812,*,DV1,NBV
USD,IR0,.0098,*,IR1,MKN
USD,IRF0,.256,*,IRF,CVR

SCENARIO_LIST = ['CSW', 'DV1']

def to_Year_Maturity( instrument):
    return {'ABC':'5Y', 'GFH':'1Y3M', 'WQD':'6Y9M', 'POU':'10Y6M', 'LKY':'9M', 'BVF':'3Y3M', 'FCD':'6Y9M', 'LKS':'2Y9M', 'ZSX':'12Y3M', 'BNH':'7Y6M', 'NBV':'20Y9M', 'MKN':'6Y9M', 'CVR':'20Y9M'}.get(instrument,'99Y')

My attempts

view['Field3'] = np.where(view['Scenario'].isin(SCENARIO_LIST), to_Year_Maturity(view['Instrument'][0]), view['Field3'])

The line above returns 5Y for every row matching the SCENARIO_LIST condition. I also tried the line below, but it throws TypeError: unhashable type: 'Series'

view['Field3'] = np.where(view['Scenario'].isin(SCENARIO_LIST), to_Year_Maturity(view['Instrument'].astype(str)), view['Field3'])

My expected output is:

CCY,Field0,Field1,Field3,Scenario,Instrument
USD,CSW,.03456,5Y,CSW,ABC
USD,SPRD,.00238,1Y3M,DV1,GFH
USD,IR0,.0098,*,IR1,WQD
USD,IRF0,.256,*,IRF,POU
USD,CSW,.9876,9M,CSW,LKY
USD,CSW,.12678,3Y3M,CSW,BVF
USD,SPRD,.0072,6Y9M,DV1,FCD
USD,IR0,.03248,*,IR1,LKS
USD,IRF0,.0963,*,IRF,ZSX
USD,CSW,.5632,7Y6M,CSW,BNH
USD,SPRD,.9812,20Y9M,DV1,NBV
USD,IR0,.0098,*,IR1,MKN
USD,IRF0,.256,*,IRF,CVR

CodePudding user response:

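You can achieve this as follows. In np.where(condition, x, y), you want x to be pd.Series.map applied to df['Instrument'] with your function passed. map calls the function once per element, whereas your second attempt passed the whole Series to the function, and dict.get cannot hash a Series, hence the TypeError. np.where then takes the mapped value wherever condition is True and keeps the existing Field3 value everywhere else. So: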

import pandas as pd
import numpy as np

SCENARIO_LIST = ['CSW', 'DV1']

def to_Year_Maturity(instrument):
    return {'ABC':'5Y', 'GFH':'1Y3M', 'WQD':'6Y9M', 'POU':'10Y6M', 
            'LKY':'9M', 'BVF':'3Y3M', 'FCD':'6Y9M', 'LKS':'2Y9M', 
            'ZSX':'12Y3M', 'BNH':'7Y6M', 'NBV':'20Y9M', 'MKN':'6Y9M', 
            'CVR':'20Y9M'}.get(instrument,'99Y')

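# df is assumed to already hold the sample data from the question
# (the dataframe called `view` there).
# Note: map runs to_Year_Maturity on every row, not only the matching ones.
df['Field3'] = np.where(df['Scenario'].isin(SCENARIO_LIST),
                        df['Instrument'].map(to_Year_Maturity),
                        df['Field3'])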

print(df)

    CCY Field0   Field1 Field3 Scenario Instrument
0   USD    CSW  0.03456     5Y      CSW        ABC
1   USD   SPRD  0.00238   1Y3M      DV1        GFH
2   USD    IR0  0.00980      *      IR1        WQD
3   USD   IRF0  0.25600      *      IRF        POU
4   USD    CSW  0.98760     9M      CSW        LKY
5   USD    CSW  0.12678   3Y3M      CSW        BVF
6   USD   SPRD  0.00720   6Y9M      DV1        FCD
7   USD    IR0  0.03248      *      IR1        LKS
8   USD   IRF0  0.09630      *      IRF        ZSX
9   USD    CSW  0.56320   7Y6M      CSW        BNH
10  USD   SPRD  0.98120  20Y9M      DV1        NBV
11  USD    IR0  0.00980      *      IR1        MKN
12  USD   IRF0  0.25600      *      IRF        CVR

Also possible, and this variant calls the function only for the rows that match the condition:

df['Field3'] = df.loc[df['Scenario'].isin(SCENARIO_LIST),'Instrument']\
    .map(to_Year_Maturity)

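# non-matching rows are now NaN in Field3; reset them to asterisks
# (restricted to Field3 so NaNs in other columns are left alone)
df['Field3'] = df['Field3'].fillna('*')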
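
Since the question says each call is an expensive database lookup, it may be worth calling the function only once per distinct instrument among the matching rows. The following is a minimal sketch of that idea, not part of the original answer. It assumes the result depends only on the instrument. The dictionary _MATURITIES and the calls list are hypothetical stand-ins so the number of lookups can be observed; in production to_Year_Maturity would be the real database call.

```python
import io
import pandas as pd

SCENARIO_LIST = ['CSW', 'DV1']

# Hypothetical stand-in for the database; values copied from the question.
_MATURITIES = {'ABC': '5Y', 'GFH': '1Y3M', 'WQD': '6Y9M', 'POU': '10Y6M',
               'LKY': '9M', 'BVF': '3Y3M', 'FCD': '6Y9M', 'LKS': '2Y9M',
               'ZSX': '12Y3M', 'BNH': '7Y6M', 'NBV': '20Y9M', 'MKN': '6Y9M',
               'CVR': '20Y9M'}
calls = []  # hypothetical: records every instrument that gets looked up

def to_Year_Maturity(instrument):
    calls.append(instrument)
    return _MATURITIES.get(instrument, '99Y')

# Sample data from the question.
csv_text = """CCY,Field0,Field1,Field3,Scenario,Instrument
USD,CSW,.03456,*,CSW,ABC
USD,SPRD,.00238,*,DV1,GFH
USD,IR0,.0098,*,IR1,WQD
USD,IRF0,.256,*,IRF,POU
USD,CSW,.9876,*,CSW,LKY
USD,CSW,.12678,*,CSW,BVF
USD,SPRD,.0072,*,DV1,FCD
USD,IR0,.03248,*,IR1,LKS
USD,IRF0,.0963,*,IRF,ZSX
USD,CSW,.5632,*,CSW,BNH
USD,SPRD,.9812,*,DV1,NBV
USD,IR0,.0098,*,IR1,MKN
USD,IRF0,.256,*,IRF,CVR
"""
df = pd.read_csv(io.StringIO(csv_text))

# Rows whose Scenario is in the list.
mask = df['Scenario'].isin(SCENARIO_LIST)

# One lookup per distinct instrument among the matching rows only.
unique_instruments = df.loc[mask, 'Instrument'].unique()
lookup = {inst: to_Year_Maturity(inst) for inst in unique_instruments}

# Write results into the matching rows; other rows keep their Field3 value.
df.loc[mask, 'Field3'] = df.loc[mask, 'Instrument'].map(lookup)

print(df)
print(len(calls), "lookups made")
```

Assigning through df.loc[mask, 'Field3'] leaves the non-matching rows untouched, so no fillna step is needed. If the same instrument appears on many of the 50k rows, the number of database calls is bounded by the number of distinct instruments rather than the number of rows.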