I am aware there are multiple answers for string/object to date time conversion. I have tried most of them, but still not able to get the desired result.
I have date in format 2024-08-01 00:00:00.0000000
and I want only the date part 2024-08-01
format.
My dataframe is in format: Date is of type object
Date
2024-08-01 00:00:00.0000000
2024-09-01 00:00:00.0000000
Using the answers provided in stackoverflow, I performed:
from dateutil.parser import parse
df['DATE'] = df['DATE'].apply(lambda x : parse(x)) #This will give in format 2024-08-01 00:00:00 of dtype datetime.
Then I use strftime
to convert it into %Y-%m-%d
format.
def tz_datetime(date_tz):
date_obj = dt.datetime.strftime(date_tz, '%Y-%m-%d')
return date_obj
df['DATE'] = df['DATE'].apply(tz_datetime)
My DATE column is of object dtype now.
df.head()
gives me:
DATE
2024-08-01
2024-09-01
Then I use pd.to_datetime
to convert it into datetime object.
df['DATE'] = pd.to_datetime(df['DATE'], format="%Y-%m-%d")
I also use floor()
option to get the date in required date format.
df['DATE'] = df['DATE'].dt.floor('d')
Now the DATE dtype in datetime64[ns].
and df.head()
also shows the required format.
But, when I export it to csv
format or I display(df)
again my DATE column shows different format.
DATE
2024-10-01T00:00:00.000 0000
2024-08-01T00:00:00.000 0000
2024-07-01T00:00:00.000 0000
2024-06-01T00:00:00.000 0000
2017-10-01T00:00:00.000 0000
I have exhausted all option to get this date in "%Y-%m-%d" format. Can someone please explain me what am I doing wrong, why this date format shows correct when I do .head()
and shows different/actual value of DATE column when I display()
the dataframe.
CodePudding user response:
If you are not already using it the date_format parameter might be what you're missing.
df.to_csv(filename, date_format='%Y-%m-%d')
As you have demonstrated above, truncating the data to just 'Y-m-d' still gives the longer format 'Y-m-d H:M:S.f' (and an extra zero) when exporting because the date type hasn't changed. This means the output format for the date type hasn't changed either.
Also a side note: You want to avoid overwriting the data with strftime()
just to get a different format as you may need the lost data in later analysis.