I am trying to convert a dataframe column "date" from string
to datetime
. I have this format: "January 1, 2001 Monday".
I tried to use the following:
from dateutil import parser
for index,v in df['date'].items():
df['date'][index] = parser.parse(df['date'][index])
But it gives me the following error:
ValueError: Cannot set non-string value '2001-01-01 00:00:00' into a StringArray.
I checked the datatype of the column "date" and it tells me string
type.
This is the snippet of the dataframe:
Any help would be most appreciated!
CodePudding user response:
why don't you try this instead of dateutils
, pandas
offer much simpler tools such as pd.to_datetime
function:
df['date'] = pd.to_datetime(df['date'], format='%B %d, %Y %A')
CodePudding user response:
You need to specify the format for the datetime
object in order it to be parsed correctly. The documentation helps with this:
%A
is for Weekday as locale’s full name, e.g., Monday%B
is for Month as locale’s full name, e.g., January%d
is for Day of the month as a zero-padded decimal number.%Y
is for Year with century as a decimal number, e.g., 2021.
Combining all of them we have the following function:
from datetime import datetime
def mdy_to_ymd(d):
return datetime.strptime(d, '%B %d, %Y %A').strftime('%Y-%m-%d')
print(mdy_to_ymd('January 1, 2021 Monday'))
> 2021-01-01
One more thing is for your case, .apply()
will work faster, thus the code is:
df['date'] = df['date'].apply(lambda x: mdy_to_ymd)
Feel free to add Hour-Minute-Second if needed.