I have a pandas DataFrame column named disbursal_date, which is a datetime:
disbursal_date
2009-01-28
2008-01-03
2008-07-15
and so on...
I want to keep the day and month and replace the year with 2022 for all values.
I tried using df['disbursal_date'].map(lambda x: x.replace(year=2022))
but this didn't work for me.
CodePudding user response:
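- Use apply to run a Python function on each value of a DataFrame column. Series.map also accepts a function, so the choice between map and apply is probably not what caused the failure; either way, assign the result back to the column, since neither method modifies it in place.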
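- Make sure the column's dtype is pandas datetime (datetime64[ns]) and not object or string. On a string value, x.replace(year=2022) calls str.replace, which has no year argument and raises a TypeError.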
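The dtype point can be checked directly. This is a minimal sketch, assuming the column was loaded as plain strings; the sample values come from the question, and the name failed_on_strings is only illustrative:

```python
import pandas as pd

# Assumption: the column arrived as plain strings rather than datetimes.
df = pd.DataFrame({"disbursal_date": ["2009-01-28", "2008-01-03", "2008-07-15"]})

try:
    df["disbursal_date"].map(lambda x: x.replace(year=2022))
    failed_on_strings = False
except TypeError:
    # On a str, x.replace is str.replace, which has no `year` keyword.
    failed_on_strings = True

# Convert to pandas datetime; the same lambda then works, with map as well as apply.
df["disbursal_date"] = pd.to_datetime(df["disbursal_date"])
df["disbursal_date"] = df["disbursal_date"].map(lambda x: x.replace(year=2022))
```

If failed_on_strings ends up True, the original error most likely came from the dtype rather than from map itself.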
Below is the sample code I tried. It works and replaces the year with 2022.
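import pandas as pd
df = pd.DataFrame(['2009-01-28', '2008-01-03', '2008-07-15'], columns=['disbursal_old'])
# Convert the strings to pandas datetime so each value is a Timestamp
df['disbursal_old'] = df['disbursal_old'].astype('datetime64[ns]')
# Timestamp.replace returns a copy of the value with the year changed
df['disbursal_new'] = df['disbursal_old'].apply(lambda x: x.replace(year=2022))
print(df['disbursal_new'])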
0 2022-01-28
1 2022-01-03
2 2022-07-15
Name: disbursal_new, dtype: datetime64[ns]
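One caveat with x.replace(year=2022): a 29 February value has no counterpart in 2022, so Timestamp.replace raises a ValueError for it. A possible alternative, sketched here under the assumption that such dates should become NaT, is to rebuild the dates from their month and day parts. The 2008-02-29 value is a hypothetical addition to the question's sample, and the names s, parts and new_dates are illustrative:

```python
import pandas as pd

# Sample dates; 2008-02-29 is a hypothetical leap-day value added for illustration.
s = pd.to_datetime(pd.Series(["2009-01-28", "2008-02-29", "2008-07-15"]))

# Rebuild each date from its parts with the year fixed at 2022.
parts = pd.DataFrame({"year": 2022, "month": s.dt.month, "day": s.dt.day})

# errors="coerce" turns impossible dates (29 Feb 2022) into NaT instead of raising.
new_dates = pd.to_datetime(parts, errors="coerce")
print(new_dates)
```

This avoids calling a Python function per row. Whether NaT is the right result for leap days depends on the data; mapping them to 28 February would be another reasonable choice.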
The code below computes the difference in years between the new and old dates.
df['disbursal_diff_year'] = df['disbursal_new'].dt.year - df['disbursal_old'].dt.year
print(df)
  disbursal_old disbursal_new  disbursal_diff_year
0    2009-01-28    2022-01-28                   13
1    2008-01-03    2022-01-03                   14
2    2008-07-15    2022-07-15                   14