I have two columns named "deleted_at" (the date the client canceled the service) and "birth_date" (the client's date of birth). I have already converted the strings to datetime using pd.to_datetime. Example:
mydataframe.loc[:,'deleted_at'] = pd.to_datetime(mydataframe['deleted_at'])
I am trying to calculate the age of the client using:
mydataframe['age'] = (mydataframe['deleted_at'] - mydataframe['birth_date'])/365.25
The problem is that this code returns a timedelta expressed in days, and I would like the result in years. Example:
[deleted_at] 2018-06-17 23:59:08 - [birth_date] 1966-01-25
This returns: 52 days 09:27:43.101984942. It should return 52 years, x months, y days, or even just 52 years. Is there an easy way to do that?
CodePudding user response:
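This solution is much easier:
mydataframe['age'] = (mydataframe['deleted_at'].dt.year - mydataframe['birth_date'].dt.year)
You can access day, month and year the same way through the .dt accessor (for the week number, recent pandas versions use .dt.isocalendar().week). Note that subtracting the years alone ignores whether the birthday has already passed in the final year, so the result can be one year too high.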
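As a hedged sketch of how the year subtraction could be adjusted for birthdays that have not yet occurred, the snippet below compares the month and day of both dates. The sample rows and the column names `age` and `age_approx` are assumptions made for illustration; the fractional variant simply divides by a 365.25-day timedelta and is only approximate.

```python
import pandas as pd

# Hypothetical sample data: the second row is canceled before the birthday
df = pd.DataFrame({
    'deleted_at': pd.to_datetime(['2018-06-17 23:59:08', '2018-01-10 12:00:00']),
    'birth_date': pd.to_datetime(['1966-01-25', '1966-01-25']),
})

end, start = df['deleted_at'], df['birth_date']

# Plain year subtraction, as in the answer above
year_diff = end.dt.year - start.dt.year

# True where the end date falls before the birthday within its year
before_birthday = (end.dt.month < start.dt.month) | (
    (end.dt.month == start.dt.month) & (end.dt.day < start.dt.day)
)

# Completed years: subtract one where the birthday has not happened yet
df['age'] = year_diff - before_birthday.astype(int)

# Approximate fractional years: dividing by a Timedelta yields plain floats
df['age_approx'] = (end - start) / pd.Timedelta(days=365.25)

print(df)
```

Dividing the difference by the bare number 365.25, as in the question, only scales the timedelta, which is why the result still prints as "52 days"; dividing by `pd.Timedelta(days=365.25)` removes the unit.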
CodePudding user response:
You can use relativedelta from the dateutil package (already installed with pandas):
from dateutil.relativedelta import relativedelta
cols = ['deleted_at', 'birth_date']
df['relativedelta'] = df[cols].apply(lambda x: relativedelta(x['deleted_at'], x['birth_date']), axis=1)
Output:
deleted_at | birth_date | relativedelta |
---|---|---|
2018-06-17 23:59:08 | 1966-01-25 00:00:00 | relativedelta(years=+52, months=+4, days=+23, hours=+23, minutes=+59, seconds=+8) |
Setup:
import pandas as pd
df = pd.DataFrame({'deleted_at': [pd.Timestamp('2018-06-17 23:59:08')],
                   'birth_date': [pd.Timestamp('1966-01-25')]})
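If only the number of years (or a "years, months, days" string) is needed, the components can be read from the relativedelta attributes. The following is a minimal sketch reusing the setup above; the helper name `age_parts` and the new column names are assumptions, not part of the original answer.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({'deleted_at': [pd.Timestamp('2018-06-17 23:59:08')],
                   'birth_date': [pd.Timestamp('1966-01-25')]})

def age_parts(row):
    """Hypothetical helper: split the date difference into years, months, days."""
    rd = relativedelta(row['deleted_at'], row['birth_date'])
    return pd.Series({'years': rd.years, 'months': rd.months, 'days': rd.days})

# One row of components per input row
parts = df.apply(age_parts, axis=1)

# Keep just the completed years as the age
df['age'] = parts['years']

# Optional human-readable form built from the three components
df['age_text'] = parts.apply(
    lambda p: f"{p['years']} years, {p['months']} months, {p['days']} days", axis=1
)

print(df[['age', 'age_text']])
```

Because `apply` with `axis=1` calls the helper once per row, this is slower than the vectorized `.dt` approach on large frames, but it gives calendar-accurate components.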