Home > Software design >  Python If Statement: Running different formula based on text in a Pandas dataframe column
Python If Statement: Running different formula based on text in a Pandas dataframe column

Time:10-23

I have a dataframe like below:

cusip       fmcSecType  effectiveMaturity   weightedAverageMaturity
683244AM9   OABS        11/26/2029          3/23/2026
975014AC5   OABS        11/20/2040          3/31/2033
44107EAA7   OABS        1/31/2045           12/27/2035
140006AA5   OABS        3/31/2046           5/21/2036
35086RAA1   OABS        6/23/2045           12/26/2031
00287XAA9   AABS        12/31/2043          9/28/2034
004331AA2   OABS        8/31/2039           2/12/2032
00433JAA3   OABS        9/30/2037           12/24/2030
00434CAC3   OABS        12/31/2042          1/28/2035
09852URA3   NPTB        4/1/2022            NaN
0985ZUMQ7   NPTB        12/23/2021          NaN
74800JY41   NPTB        11/4/2021           NaN
44051F10 J  EETF        5/6/2031            NaN
44051H10 J  EETF        1/16/2023           NaN
SF00000132  BSNS        7/5/2022            NaN
SV00000132  BSNS        12/30/2021          NaN

What I would like to do is to calculate the term (from a set date)

enddate = date(2021, 10, 15)

But calculation parameter differ for different fmcSecType

  • for fmcSecType == 'OABS' or 'AABS', use weightedAverageMaturity
  • for all else, use effectiveMaturity

So I tried below but it doesn't work:

if (metric_df['fmcSecType'] == 'AABS') | (metric_df['fmcSecType'] == 'OABS'):
    metric_df['ET_manual'] = (metric_df['weightedAverageMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
else:
    metric_df['ET_manual'] = (metric_df['effectiveMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days

Error code was:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I tried another method and still doesn't work, the calculation using weightedAverageMaturity just doesn't happen:

for i in metric_df['fmcSecType']:
    if i == 'AABS':
        metric_df['ET_manual'] = (metric_df['weightedAverageMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
    elif i == 'OABS':
        metric_df['ET_manual'] = (metric_df['weightedAverageMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
    else:
        metric_df['ET_manual'] = (metric_df['effectiveMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days

CodePudding user response:

Instead of using the Python if-else statement which does not support checking the boolean series of Pandas, you can setup a Pandas boolean mask by .isin() and then use .loc with the boolean mask to set up the new columns with different date columns, as follows:

For each of columns effectiveMaturity and weightedAverageMaturity, we convert them from date string to datetime format by pd.to_datetime() before the calculation of date differences.

from datetime import date, timedelta

enddate = date(2021, 10, 15)

# Setup boolean mask for checking whether fmcSecType is 'AABS' or 'OABS'
mask = metric_df['fmcSecType'].isin(['AABS', 'OABS'])  

# For rows with fmcSecType is 'AABS' or 'OABS', use `weightedAverageMaturity` as source date
metric_df.loc[mask, 'ET_manual'] = (pd.to_datetime(metric_df['weightedAverageMaturity'], format='%m/%d/%Y') - pd.Timestamp(enddate - timedelta(days = 1))).dt.days

# For rows with fmcSecType is neither 'AABS' nor 'OABS', use `effectiveMaturity` as source date
metric_df.loc[~mask, 'ET_manual'] = (pd.to_datetime(metric_df['effectiveMaturity'], format='%m/%d/%Y') - pd.Timestamp(enddate - timedelta(days = 1))).dt.days

Result:

print(metric_df)


         cusip fmcSecType effectiveMaturity weightedAverageMaturity  ET_manual
0    683244AM9       OABS        11/26/2029               3/23/2026     1621.0
1    975014AC5       OABS        11/20/2040               3/31/2033     4186.0
2    44107EAA7       OABS         1/31/2045              12/27/2035     5187.0
3    140006AA5       OABS         3/31/2046               5/21/2036     5333.0
4    35086RAA1       OABS         6/23/2045              12/26/2031     3725.0
5    00287XAA9       AABS        12/31/2043               9/28/2034     4732.0
6    004331AA2       OABS         8/31/2039               2/12/2032     3773.0
7    00433JAA3       OABS         9/30/2037              12/24/2030     3358.0
8    00434CAC3       OABS        12/31/2042               1/28/2035     4854.0
9    09852URA3       NPTB          4/1/2022                     NaN      169.0
10   0985ZUMQ7       NPTB        12/23/2021                     NaN       70.0
11   74800JY41       NPTB         11/4/2021                     NaN       21.0
12  44051F10 J       EETF          5/6/2031                     NaN     3491.0
13  44051H10 J       EETF         1/16/2023                     NaN      459.0
14  SF00000132       BSNS          7/5/2022                     NaN      264.0
15  SV00000132       BSNS        12/30/2021                     NaN       77.0
  • Related