I have the following pandas dataframe
I want to subtract date_of_birth from date_of_death to create a new column, "years_lived", containing the years lived. I tried each of the three approaches below (individually, of course):
df['years_lived'] = (df['date_of_death'] - df['date_of_birth']).dt.days
df['years_lived'] = df['date_of_death'].sub(df['date_of_birth'], axis=0)
df['years_lived'] = df['date_of_death'] - df['date_of_birth']
but I got a TypeError: unsupported operand type(s) for -: 'str' and 'str'
CodePudding user response:
df['years_lived'] = pd.to_datetime(df['date_of_death']) - pd.to_datetime(df['date_of_birth'])
CodePudding user response:
pd.to_datetime converts your date strings into datetime values, and you can then subtract one column from the other.
https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html
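To make that concrete, here is a minimal sketch. The sample data is made up (the original dataframe isn't shown), and the way of counting completed years is just one reasonable choice, not the only one. The subtraction gives a timedelta Series, so getting years takes an extra step:

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
df = pd.DataFrame({
    "date_of_birth": ["1900-01-01", "1950-06-15"],
    "date_of_death": ["1980-01-01", "2000-06-14"],
})

# Both columns hold strings (dtype object), which is why "-" raises TypeError.
print(df.dtypes)

# Convert the strings to datetime64 values.
birth = pd.to_datetime(df["date_of_birth"])
death = pd.to_datetime(df["date_of_death"])

# Subtracting datetimes yields a timedelta Series; .dt.days gives whole days.
lifespan = death - birth
df["days_lived"] = lifespan.dt.days

# Completed years: year difference, minus one if the death date falls
# before the birthday in the final year.
before_birthday = (death.dt.month < birth.dt.month) | (
    (death.dt.month == birth.dt.month) & (death.dt.day < birth.dt.day)
)
df["years_lived"] = death.dt.year - birth.dt.year - before_birthday.astype(int)

print(df)
```

If an approximate figure is enough, dividing days_lived by 365.25 also works, but it can be off by one right around a birthday.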
CodePudding user response:
You need to convert the string representation of the dates to actual dates before subtracting.
df['years_lived'] = (pd.to_datetime(df['date_of_death']) - pd.to_datetime(df['date_of_birth'])).dt.days
or, if that doesn't work, convert the columns first and then subtract:
df['date_of_death'] = pd.to_datetime(df['date_of_death'], errors='coerce')
df['date_of_birth'] = pd.to_datetime(df['date_of_birth'], errors='coerce')
df['years_lived'] = (df['date_of_death'] - df['date_of_birth']).dt.days
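If some of the date strings can't be parsed, passing errors='coerce' to pd.to_datetime turns them into NaT instead of raising, and the difference for those rows comes out as NaN. A small sketch with made-up data (the "unknown" value is only an assumption about what a bad entry might look like):

```python
import pandas as pd

# Hypothetical data with one unparseable birth date.
df = pd.DataFrame({
    "date_of_birth": ["1920-03-10", "unknown"],
    "date_of_death": ["1990-03-10", "2001-12-31"],
})

# Convert both columns in place; bad values become NaT rather than raising.
for col in ("date_of_birth", "date_of_death"):
    df[col] = pd.to_datetime(df[col], errors="coerce")

# Rows with a NaT on either side produce NaN in the result.
df["days_lived"] = (df["date_of_death"] - df["date_of_birth"]).dt.days

# Count how many rows could not be computed.
n_missing = df["days_lived"].isna().sum()
print(df)
print(n_missing)
```

You can then decide whether to drop those rows with dropna or look at the raw values to see why they failed to parse.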