I have a column creation_date in my pandas DataFrame, stored as type str, in the following format:
2020-02-06 11:35:17 00:00
I am trying to create a new column in my DataFrame titled days_since_creation:
from datetime import date
today = date.today()
df['days_since_creation'] = date.strptime(df['creation_date'],'%m/%d/%Y') - today
My code is not correct and I'm not sure where I am going wrong. Any help is much appreciated.
CodePudding user response:
Just tested this out and it works alright for me; apologies for the error in my comment.
df['days_since_creation'] = (pd.to_datetime(df['creation_date']).dt.date -
pd.Timestamp.today().date())
Result with just that one sample:
creation_date days_since_creation
0 2020-02-06 11:35:17 00:00 -646 days
Edit: if the granularity is important and you want to specify how to handle the timezone information, do something like this instead:
df['days_since_creation'] = (pd.to_datetime(df['creation_date']).dt.tz_localize(None) -
pd.Timestamp.today())
Result of that:
creation_date days_since_creation
0 2020-02-06 11:35:17 00:00 -647 days 23:26:51.569694
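For anyone who wants to try the timezone handling in isolation, here is a minimal sketch. It assumes the strings carry a "+00:00" UTC offset (the sample in the question shows a space there, so that is a guess), and it uses a fixed reference time, now_utc, instead of the current time so the output is reproducible. The names created, now_utc, elapsed_aware and elapsed_naive are just for illustration. pandas generally refuses to subtract a timezone-naive value from a timezone-aware one, so both sides should either keep their timezone or both drop it:

```python
import pandas as pd

# Assumed sample value; the "+00:00" offset is a guess at the real format.
created = pd.to_datetime(pd.Series(["2020-02-06 11:35:17+00:00"]), utc=True)

# Fixed reference time for reproducibility; in practice this could be
# pd.Timestamp.now(tz="UTC").
now_utc = pd.Timestamp("2021-11-13 12:00:00", tz="UTC")

# Option 1: keep both sides timezone-aware (both in UTC).
elapsed_aware = now_utc - created

# Option 2: drop the timezone from both sides before subtracting.
elapsed_naive = now_utc.tz_localize(None) - created.dt.tz_localize(None)

print(elapsed_aware)
print(elapsed_naive)
```

Both options give the same elapsed time here because everything is in UTC. Note that this sketch subtracts the creation time from the reference time, so the result is positive.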
CodePudding user response:
This will work:
df['days_since_creation'] = pd.to_datetime(df['creation_date'])
df['days_since_creation'] = today - df.days_since_creation.dt.date
If you want just the integer number of days:
df['days_since_creation'] = (today - df.days_since_creation.dt.date ).dt.days
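As for why the original attempt fails: datetime.date has no strptime method, strptime parses a single string rather than a whole column, and the format '%m/%d/%Y' does not match the data. Below is a self-contained sketch of the whole-days approach that can be run as is. It assumes the column values look like "2020-02-06 11:35:17+00:00" (the offset is a guess) and uses a fixed reference date, reference, so the result does not depend on when it is run:

```python
import pandas as pd

# Assumed sample data; the "+00:00" offset is a guess at the real format.
df = pd.DataFrame({"creation_date": ["2020-02-06 11:35:17+00:00"]})

# Parse the strings into timezone-aware UTC timestamps.
created = pd.to_datetime(df["creation_date"], utc=True)

# Fixed reference date for reproducibility; in practice this could be
# pd.Timestamp.now(tz="UTC").
reference = pd.Timestamp("2021-11-13", tz="UTC")

# Truncate both sides to midnight, subtract, and keep the whole days as integers.
df["days_since_creation"] = (reference.normalize() - created.dt.normalize()).dt.days

print(df)
```

Normalizing both sides to midnight before subtracting means the count reflects calendar days rather than elapsed 24-hour periods, which is usually what "days since" is meant to express.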