I am working on a stock market analysis where I look at past Balance Sheets and income statements, and want to change the date column which saves them as a string of the form "2021-09-30" into datetimes. I am trying to use pd.to_datetime but it is giving me an error.
When I run
df['datekey'] = pd.to_datetime(df['datekey'], format='%Y-%m-%d')
I get
"ValueError: time data "2021-09-30" doesn't match format specified"
when it should (if I am doing this correctly).
This column doesn't have a time value in it. It is just (for all dates) "2021-09-30".
CodePudding user response:
Seems like the value itself is enclosed by double quotes, you need to include quotes as well in your formats:
df['datekey'] = pd.to_datetime(df['datekey'], format='"%Y-%m-%d"')
Alternatively, you can strip off the quotes before converting to datetime, this is useful if some values are not enclosed by double quotes:
df['datekey'] = pd.to_datetime(df['datekey'].str.strip('"'), format='%Y-%m-%d')
CodePudding user response:
You have extra quotes and spaces in your data. Try:
df["datekey"] = pd.to_datetime(df["datekey"].str.replace(" ","").str.strip('"'), format="%Y-%m-%d")
>>> df["datekey"]
0 2021-09-30
1 2021-06-30
2 2021-03-31
3 2020-12-31
4 2020-09-30
5 2020-06-30
6 2020-03-31
7 2019-12-31
8 2019-09-30
9 2019-06-30
Name: datekey, dtype: datetime64[ns]