I'm trying to convert all the dates in a pandas column from object to date. The object column is called PRODUCT_CREATION_DATE, and has values like:
2021-12-09 00:00:00-05:00
2021-04-20 00:00:00-07:00
2017-04-25 00:00:00-07:00
I already know I can convert them with pd.to_datetime, but I can't find the correct format and keep getting a format error. The problem is the last part: I have no idea what the -05:00 or -07:00 is supposed to represent. I actually only care about the year, month and day, so a way to get a date with only those would be even better.
This is the code I'm trying to use:
pd.to_datetime(df_info['PRODUCT_CREATION_DATE'], format='%Y-%m-%d %H:%M:%S-%f')
This returns
ValueError: unconverted data remains: :00
CodePudding user response:
The code %f is used for microseconds, whereas in your sample dates, -05:00 refers to the UTC timezone offset. So, the format you need to use with to_datetime will be %Y-%m-%d %H:%M:%S%z
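As a minimal sketch of that format in use: the sample values mix -05:00 and -07:00, so this example also passes utc=True, which is an addition on my part so that every row lands in a single timezone-aware column. The Series name s is only illustrative.

```python
import pandas as pd

# Sample values copied from the question; the name `s` is illustrative.
s = pd.Series([
    "2021-12-09 00:00:00-05:00",
    "2021-04-20 00:00:00-07:00",
    "2017-04-25 00:00:00-07:00",
])

# %z consumes the trailing "-05:00" / "-07:00" offset.
# utc=True converts each row to UTC, so rows with different offsets
# share one timezone-aware dtype instead of staying mixed.
parsed = pd.to_datetime(s, format="%Y-%m-%d %H:%M:%S%z", utc=True)

print(parsed)
```

Keep in mind that converting to UTC shifts the clock time, so for midnight values with a positive offset the calendar date would move back a day. If you only want the date as written in the string, slicing off the date part before parsing avoids that.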
CodePudding user response:
This will give you only the yyyy-mm-dd part, as a datetime, which is what you were looking for:
df['Date'] = pd.to_datetime(df['Date'].apply(lambda x : x.split(' ')[0]))
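The same idea can probably be written without apply, using the vectorized .str accessor. This is only a sketch: the column name Date follows the line above, the sample frame is built from the values in the question, and the explicit format argument is my own addition.

```python
import pandas as pd

# Sample frame built from the question's values; "Date" matches the
# column name used in the one-liner above.
df = pd.DataFrame({"Date": [
    "2021-12-09 00:00:00-05:00",
    "2021-04-20 00:00:00-07:00",
    "2017-04-25 00:00:00-07:00",
]})

# Keep only the text before the first space, i.e. "2021-12-09".
date_part = df["Date"].str.split(" ").str[0]

# Parse the remaining yyyy-mm-dd text; the result is timezone-naive.
df["Date"] = pd.to_datetime(date_part, format="%Y-%m-%d")

print(df)
```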
CodePudding user response:
If you just need the date, extract only the date part of each string (its first 10 characters). Starting from:
                        col1
0  2021-12-09 00:00:00-05:00
1  2021-04-20 00:00:00-07:00
2  2017-04-25 00:00:00-07:00
df['col1'] = pd.to_datetime(df['col1'].str[:10])
print(df)
print(df.dtypes)
        col1
0 2021-12-09
1 2021-04-20
2 2017-04-25
col1    datetime64[ns]
dtype: object
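Note that the column above is still a datetime column (the time is just midnight). If you want values that carry only year, month and day, one option is .dt.date, which gives plain Python date objects, or .dt.strftime if text is enough. A small sketch, reusing the col1 name from above; as_dates and as_text are names made up for illustration:

```python
import datetime
import pandas as pd

# Same sample data as above.
df = pd.DataFrame({"col1": [
    "2021-12-09 00:00:00-05:00",
    "2021-04-20 00:00:00-07:00",
    "2017-04-25 00:00:00-07:00",
]})

# First 10 characters are the yyyy-mm-dd part.
df["col1"] = pd.to_datetime(df["col1"].str[:10])

# Python datetime.date objects (the resulting column has object dtype).
as_dates = df["col1"].dt.date

# Plain "yyyy-mm-dd" strings.
as_text = df["col1"].dt.strftime("%Y-%m-%d")

print(as_dates)
print(as_text)
```

The trade-off is that the .dt.date column no longer has a datetime dtype, so the .dt accessor won't be available on it afterwards.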