I have a dataframe like below:
cusip fmcSecType effectiveMaturity weightedAverageMaturity
683244AM9 OABS 11/26/2029 3/23/2026
975014AC5 OABS 11/20/2040 3/31/2033
44107EAA7 OABS 1/31/2045 12/27/2035
140006AA5 OABS 3/31/2046 5/21/2036
35086RAA1 OABS 6/23/2045 12/26/2031
00287XAA9 AABS 12/31/2043 9/28/2034
004331AA2 OABS 8/31/2039 2/12/2032
00433JAA3 OABS 9/30/2037 12/24/2030
00434CAC3 OABS 12/31/2042 1/28/2035
09852URA3 NPTB 4/1/2022 NaN
0985ZUMQ7 NPTB 12/23/2021 NaN
74800JY41 NPTB 11/4/2021 NaN
44051F10 J EETF 5/6/2031 NaN
44051H10 J EETF 1/16/2023 NaN
SF00000132 BSNS 7/5/2022 NaN
SV00000132 BSNS 12/30/2021 NaN
What I would like to do is to calculate the term (from a set date)
enddate = date(2021, 10, 15)
But calculation parameter differ for different fmcSecType
- for fmcSecType == 'OABS' or 'AABS', use weightedAverageMaturity
- for all else, use effectiveMaturity
So I tried below but it doesn't work:
if (metric_df['fmcSecType'] == 'AABS') | (metric_df['fmcSecType'] == 'OABS'):
metric_df['ET_manual'] = (metric_df['weightedAverageMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
else:
metric_df['ET_manual'] = (metric_df['effectiveMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
Error code was:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
So I tried another method and still doesn't work, the calculation using weightedAverageMaturity just doesn't happen:
for i in metric_df['fmcSecType']:
if i == 'AABS':
metric_df['ET_manual'] = (metric_df['weightedAverageMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
elif i == 'OABS':
metric_df['ET_manual'] = (metric_df['weightedAverageMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
else:
metric_df['ET_manual'] = (metric_df['effectiveMaturity'] - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
CodePudding user response:
Instead of using the Python if-else
statement which does not support checking the boolean series of Pandas, you can setup a Pandas boolean mask by .isin()
and then use .loc
with the boolean mask to set up the new columns with different date columns, as follows:
For each of columns effectiveMaturity
and weightedAverageMaturity
, we convert them from date string to datetime format by pd.to_datetime()
before the calculation of date differences.
from datetime import date, timedelta
enddate = date(2021, 10, 15)
# Setup boolean mask for checking whether fmcSecType is 'AABS' or 'OABS'
mask = metric_df['fmcSecType'].isin(['AABS', 'OABS'])
# For rows with fmcSecType is 'AABS' or 'OABS', use `weightedAverageMaturity` as source date
metric_df.loc[mask, 'ET_manual'] = (pd.to_datetime(metric_df['weightedAverageMaturity'], format='%m/%d/%Y') - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
# For rows with fmcSecType is neither 'AABS' nor 'OABS', use `effectiveMaturity` as source date
metric_df.loc[~mask, 'ET_manual'] = (pd.to_datetime(metric_df['effectiveMaturity'], format='%m/%d/%Y') - pd.Timestamp(enddate - timedelta(days = 1))).dt.days
Result:
print(metric_df)
cusip fmcSecType effectiveMaturity weightedAverageMaturity ET_manual
0 683244AM9 OABS 11/26/2029 3/23/2026 1621.0
1 975014AC5 OABS 11/20/2040 3/31/2033 4186.0
2 44107EAA7 OABS 1/31/2045 12/27/2035 5187.0
3 140006AA5 OABS 3/31/2046 5/21/2036 5333.0
4 35086RAA1 OABS 6/23/2045 12/26/2031 3725.0
5 00287XAA9 AABS 12/31/2043 9/28/2034 4732.0
6 004331AA2 OABS 8/31/2039 2/12/2032 3773.0
7 00433JAA3 OABS 9/30/2037 12/24/2030 3358.0
8 00434CAC3 OABS 12/31/2042 1/28/2035 4854.0
9 09852URA3 NPTB 4/1/2022 NaN 169.0
10 0985ZUMQ7 NPTB 12/23/2021 NaN 70.0
11 74800JY41 NPTB 11/4/2021 NaN 21.0
12 44051F10 J EETF 5/6/2031 NaN 3491.0
13 44051H10 J EETF 1/16/2023 NaN 459.0
14 SF00000132 BSNS 7/5/2022 NaN 264.0
15 SV00000132 BSNS 12/30/2021 NaN 77.0