Home > Software engineering >  Unable to convert object to Date(Y-m-d) format in Python
Unable to convert object to Date(Y-m-d) format in Python

Time:12-16

I am aware there are multiple answers for string/object to date time conversion. I have tried most of them, but still not able to get the desired result.

I have date in format 2024-08-01 00:00:00.0000000 and I want only the date part 2024-08-01 format.

My dataframe is in format: Date is of type object

Date 
2024-08-01 00:00:00.0000000 
2024-09-01 00:00:00.0000000

Using the answers provided in stackoverflow, I performed:

from dateutil.parser import parse
df['DATE'] = df['DATE'].apply(lambda x : parse(x)) #This will give in format 2024-08-01 00:00:00 of dtype datetime.

Then I use strftime to convert it into %Y-%m-%d format.

def tz_datetime(date_tz):
    date_obj = dt.datetime.strftime(date_tz, '%Y-%m-%d') 
    return date_obj
df['DATE'] = df['DATE'].apply(tz_datetime)

My DATE column is of object dtype now. df.head() gives me:

DATE
2024-08-01
2024-09-01

Then I use pd.to_datetime to convert it into datetime object.

df['DATE'] = pd.to_datetime(df['DATE'], format="%Y-%m-%d")

I also use floor() option to get the date in required date format.

df['DATE'] = df['DATE'].dt.floor('d')

Now the DATE dtype in datetime64[ns]. and df.head() also shows the required format.

But, when I export it to csv format or I display(df) again my DATE column shows different format.

DATE
2024-10-01T00:00:00.000 0000
2024-08-01T00:00:00.000 0000
2024-07-01T00:00:00.000 0000
2024-06-01T00:00:00.000 0000
2017-10-01T00:00:00.000 0000

I have exhausted all option to get this date in "%Y-%m-%d" format. Can someone please explain me what am I doing wrong, why this date format shows correct when I do .head() and shows different/actual value of DATE column when I display() the dataframe.

CodePudding user response:

If you are not already using it the date_format parameter might be what you're missing.

df.to_csv(filename, date_format='%Y-%m-%d')

As you have demonstrated above, truncating the data to just 'Y-m-d' still gives the longer format 'Y-m-d H:M:S.f' (and an extra zero) when exporting because the date type hasn't changed. This means the output format for the date type hasn't changed either.

Also a side note: You want to avoid overwriting the data with strftime() just to get a different format as you may need the lost data in later analysis.

  • Related