Home > Blockchain >  Convert DataFrame column from string to datetime for format "January 1, 2001 Monday"
Convert DataFrame column from string to datetime for format "January 1, 2001 Monday"

Time:01-27

I am trying to convert a dataframe column "date" from string to datetime. I have this format: "January 1, 2001 Monday".

I tried to use the following:

from dateutil import parser
for index,v in df['date'].items():
   df['date'][index] = parser.parse(df['date'][index])  

But it gives me the following error:

ValueError: Cannot set non-string value '2001-01-01 00:00:00' into a StringArray.

I checked the datatype of the column "date" and it tells me string type.

This is the snippet of the dataframe:

enter image description here

Any help would be most appreciated!

CodePudding user response:

why don't you try this instead of dateutils, pandas offer much simpler tools such as pd.to_datetime function:

df['date'] = pd.to_datetime(df['date'], format='%B %d, %Y %A')

CodePudding user response:

You need to specify the format for the datetime object in order it to be parsed correctly. The documentation helps with this:

  • %A is for Weekday as locale’s full name, e.g., Monday
  • %B is for Month as locale’s full name, e.g., January
  • %d is for Day of the month as a zero-padded decimal number.
  • %Y is for Year with century as a decimal number, e.g., 2021.

Combining all of them we have the following function:

from datetime import datetime

def mdy_to_ymd(d):
    return datetime.strptime(d, '%B %d, %Y %A').strftime('%Y-%m-%d')
    
print(mdy_to_ymd('January 1, 2021 Monday'))


> 2021-01-01

One more thing is for your case, .apply() will work faster, thus the code is:

df['date'] = df['date'].apply(lambda x: mdy_to_ymd)

Feel free to add Hour-Minute-Second if needed.

  • Related