Example df of how my data is stored:
df = pd.DataFrame({'DOB': {0: '2003-01-01 00:00:00 00', 1: '2003-01-02 00:00:00 00'}})
I want to convert the column to datetime, and only keep the Year-Month-Day values. Basically, exactly as this question:
Removing the timestamp from a datetime in pandas dataframe
and this one:
Convert column of date objects in Pandas DataFrame to strings
Except using these answer, extracting the dates from the column which is already a Datetime format converts the column to an Object format. This is also discussed in the comments, but seems to have been moved to chat which isn't available anymore.
Result in first answer:
[in] pd.to_datetime(df['DOB']).dt.date
[out]
0 2003-01-01
1 2003-01-02
Name: DOB, dtype: object
Result of second answer:
[in] pd.to_datetime(df['DOB']).dt.strftime('%Y-%m-%d')
[out]
0 2003-01-01
1 2003-01-02
Name: DOB, dtype: object
As you can see, it's not a Datetime format anymore but an object. Then I can't use it furthermore in my script.
Using the format = "%Y-%m-%d
option of pd.to_datetime doesn't work too, as the timestamp is still present:
[in] pd.to_datetime(df['DOB'], format = "%Y-%m-%d")
[out]
0 2003-01-01 00:00:00 00:00
1 2003-01-02 00:00:00 00:00
Name: DOB, dtype: datetime64[ns, UTC]
What is a simple solution to make sure the column stays in a datetime format, while only keeping the date as format?
CodePudding user response:
IMO there is no issue here:
s = pd.to_datetime(pd.Series(['2021-02-01 00:00:00']))
s
# 0 2021-02-01
# dtype: datetime64[ns]
And indeed, the displayed type is "object":
s.dt.date
# 0 2021-02-01
# dtype: object
But this doesn't mean much, the type is really datetime.date
:
type(s.dt.date[0])
# datetime.date
CodePudding user response:
I have it working with two extra lines, but was hoping there would be a simpler solution to this issue:
df['DOB'] = pd.to_datetime(df['DOB'])
df['DOB'] = DF['DOB'].dt.date
df['DOB'] = pd.to_datetime(df['DOB'])
# gives
[in] df
[out]
0 2003-01-01
1 2003-01-02
Name: DOB, dtype: datetime64[ns]