I have a similar question about date conversion in a data frame.
My data frame has two date string columns, hiredate and end_date, with different date formats: DD-MMM-YYYY and MMM-YY, respectively.
The end_date column has no DD component to represent the day, so a day should be added and set to the last day of the month.
Below is an example of the input data frame:
empno ename hiredate end_date
1 sreenu 17-Jun-2021 May-22
And this is the expected output after the end_date conversion:
empno ename hiredate end_date
1 sreenu 17-Jun-2021 2022-05-31
I want to create a method/function that identifies all date columns in the data frame and converts them to YYYY-MM-DD format.
CodePudding user response:
You can convert end_date to datetime with the format '%b-%y' (abbreviated month name, two-digit year) and add a MonthEnd(0) offset, which rolls each date forward to the last day of its month:
Input data:
>>> df
empno ename hiredate end_date
0 1 sreenu 17-Jun-2021 May-22
Column transformation:
>>> df['end_date'] = pd.to_datetime(
...     df['end_date'], format='%b-%y'
... ).add(pd.offsets.MonthEnd(0))
Output result:
>>> df
empno ename hiredate end_date
0 1 sreenu 17-Jun-2021 2022-05-31
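To make the role of MonthEnd(0) clearer, here is a minimal sketch. The sample values are hypothetical and chosen only for illustration, and it assumes an English locale so that '%b' matches names like May and Feb. Parsing a string without a day gives the first of the month, and MonthEnd(0) then moves it to the end of that same month, while a date already on a month end stays where it is.

```python
import pandas as pd

# Hypothetical sample values, used only to illustrate the offset
s = pd.to_datetime(pd.Series(["May-22", "Feb-24"]), format="%b-%y")

# With no day in the string, parsing yields the first day of the month;
# MonthEnd(0) rolls each value forward to the end of that month
month_end = s + pd.offsets.MonthEnd(0)

# A date that is already a month end is left unchanged by MonthEnd(0)
already_end = pd.Timestamp("2022-05-31") + pd.offsets.MonthEnd(0)

# MonthEnd(1), by contrast, moves a month-end date to the next month end
next_end = pd.Timestamp("2022-05-31") + pd.offsets.MonthEnd(1)

print(month_end)
print(already_end, next_end)
```

This is why MonthEnd(0) is a safer choice than MonthEnd(1) here: it does not skip a month if a value already falls on a month end.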
CodePudding user response:
You can use the solution pointed out in the comments, which needs the calendar module and datetime.date:
import calendar
from datetime import date
pd.to_datetime(df['end_date'], format='%b-%y').apply(
    lambda x: date(x.year, x.month, calendar.monthrange(x.year, x.month)[-1]))
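For the more general request in the question (finding every date column and converting it), one possible approach is to try a list of known formats on each string column and accept a format only if every non-null value parses. This is only a sketch under assumptions: the function name normalize_date_columns and the KNOWN_FORMATS mapping are hypothetical, the two formats are the ones mentioned in the question, and an English locale is assumed for '%b'.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype

# Hypothetical mapping: format -> True if the format has no day component
# and the result should therefore be moved to the last day of the month
KNOWN_FORMATS = {"%d-%b-%Y": False, "%b-%y": True}


def normalize_date_columns(df):
    """Return a copy of df with recognised date string columns parsed."""
    out = df.copy()
    for col in out.columns:
        # Only string-like columns are candidates for date parsing
        if not (is_object_dtype(out[col]) or is_string_dtype(out[col])):
            continue
        for fmt, needs_month_end in KNOWN_FORMATS.items():
            # Values that do not match the format become NaT
            parsed = pd.to_datetime(out[col], format=fmt, errors="coerce")
            # Accept the format only if every non-null value parsed
            if parsed.notna().any() and parsed.notna().equals(out[col].notna()):
                if needs_month_end:
                    parsed = parsed + pd.offsets.MonthEnd(0)
                out[col] = parsed
                break
    return out


df = pd.DataFrame({
    "empno": [1],
    "ename": ["sreenu"],
    "hiredate": ["17-Jun-2021"],
    "end_date": ["May-22"],
})
result = normalize_date_columns(df)
print(result)
```

The converted columns have a datetime dtype, which pandas displays as YYYY-MM-DD. If strings are needed instead, .dt.strftime('%Y-%m-%d') can be applied to each converted column. Columns in a format that is not listed are left untouched.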