I have a column of various dates and want to add another column to the dataframe that shows the difference between each date in that column and a date stored in a variable that I have set:
enddate = date(2021, 10, 15)
I tried the code below, but it raises an error:
from dateutil.relativedelta import relativedelta
metric_df['term'] = relativedelta(metric_df['maturity'], enddate).years
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
CodePudding user response:
Have you considered using datetime subtraction? Subtracting one date or datetime from another yields a timedelta object, e.g.:
from datetime import date
start_date = date(2021, 1, 1)  # no leading zeros: date(2021,01,01) is a SyntaxError in Python 3
end_date = date(2021, 1, 3)
delta = end_date - start_date
print(delta.days)  # prints 2
You can do this with columns too, yielding a column of timedelta objects.
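As a rough sketch of the column-wise version, assuming the `maturity` column holds month/day/year strings (the sample values and the `term_days` column name are made up for illustration):

```python
from datetime import date

import pandas as pd

# Hypothetical sample data; the real column may already be datetime64
metric_df = pd.DataFrame({"maturity": ["9/22/2025", "11/10/2033", "3/1/2024"]})
metric_df["maturity"] = pd.to_datetime(metric_df["maturity"], format="%m/%d/%Y")

enddate = date(2021, 10, 15)

# Subtracting a scalar Timestamp from a datetime column gives a timedelta column
delta = metric_df["maturity"] - pd.Timestamp(enddate)

# .dt.days extracts the whole-day count from each timedelta
metric_df["term_days"] = delta.dt.days
print(metric_df)
```

Wrapping `enddate` in `pd.Timestamp` keeps both sides of the subtraction in pandas' own datetime type.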
CodePudding user response:
relativedelta doesn't give you fractional years as far as I know, but you can easily calculate them yourself:
import pandas as pd
import numpy as np
# dummy data
metric_df = pd.DataFrame({'maturity': ["9/22/2025", "11/10/2033", "3/1/2024"]})
# convert to datetime
metric_df['maturity'] = pd.to_datetime(metric_df['maturity'])
# difference in days
metric_df['term'] = (metric_df['maturity'] - pd.Timestamp("2021-10-15")).dt.days
# to fractional years (approximation: divide by 366 if the maturity year is a leap year, else by 365)
metric_df['term'] = np.where(metric_df['maturity'].dt.is_leap_year, metric_df['term']/366, metric_df['term']/365)
metric_df['term']
0 3.939726
1 12.079452
2 2.371585
Name: term, dtype: float64
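If whole years are enough, relativedelta can still be used. The ValueError in the question comes from passing an entire Series to relativedelta, which expects two scalar dates. A minimal sketch that calls it once per row instead (the sample data and the `term_years` column name are assumptions for illustration):

```python
from datetime import datetime

import pandas as pd
from dateutil.relativedelta import relativedelta

# Hypothetical sample data, same values as the dummy data above
metric_df = pd.DataFrame({"maturity": ["9/22/2025", "11/10/2033", "3/1/2024"]})
metric_df["maturity"] = pd.to_datetime(metric_df["maturity"], format="%m/%d/%Y")

enddate = datetime(2021, 10, 15)

# Call relativedelta on one scalar pair at a time; .years is the
# number of completed years between the two dates
metric_df["term_years"] = metric_df["maturity"].apply(
    lambda m: relativedelta(m.to_pydatetime(), enddate).years
)
print(metric_df)
```

This runs a Python function for every row, so it will be slower than the vectorized subtraction on large frames.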