String to date in pandas

Time:01-04

I have a dataset with dates encoded as strings in the format %B %d, %Y, e.g. September 10, 2021.

Using: df['sale_date'] = pd.to_datetime(df.sale_date, format='%B %d, %Y')

produces this error: ValueError: time data 'September 10, 2021' does not match format '%B %d, %Y' (match)

Manually checking with strptime, datetime.strptime('September 10, 2021', '%B %d, %Y') produces the correct datetime object.

Is there something I missed in pd.to_datetime?

Thanks.

CodePudding user response:

Upon further investigation, I found that the error only occurs on the first element of the series: that string has a '\ufeff' (a byte order mark, BOM) prepended to it. I removed it with series.str.replace() and the conversion now works. Sorry for the bother. The remaining question is: how did that BOM end up there?
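A minimal sketch of the BOM cleanup described above, assuming a hypothetical DataFrame whose first value carries the stray character. On the open question: a likely source is a file saved as "UTF-8 with BOM" and then decoded as plain utf-8, which keeps the mark as U+FEFF at the start of the text. Decoding with Python's utf-8-sig codec drops it. Whether that is what happened here is an assumption, since the thread does not say how the data was loaded.

```python
import pandas as pd

# Where the BOM can come from: the same bytes decoded two ways.
raw = b'\xef\xbb\xbfSeptember 10, 2021'      # UTF-8 bytes with a leading BOM
kept = raw.decode('utf-8')                    # BOM survives as '\ufeff'
dropped = raw.decode('utf-8-sig')             # BOM is removed while decoding

# Hypothetical data mirroring the question: only the first value is affected.
df = pd.DataFrame({'sale_date': [kept, 'September 11, 2021']})

# repr() makes the otherwise invisible character visible when debugging.
print(repr(df['sale_date'].iloc[0]))

# Remove the BOM as a literal string (not a regex), then parse.
cleaned = df['sale_date'].str.replace('\ufeff', '', regex=False)
df['sale_date'] = pd.to_datetime(cleaned, format='%B %d, %Y')
```

If the file can be re-read, passing encoding='utf-8-sig' to whatever opens it may avoid the cleanup step altogether.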

CodePudding user response:

Very likely you need to strip some whitespace first.

If I add whitespace at the beginning, the end, or both:

import datetime

datestring = ' September 10, 2021 '

datetime.datetime.strptime(datestring, '%B %d, %Y')

it results in the same kind of error message as yours:

ValueError: time data ' September 10, 2021 ' does not match format '%B %d, %Y'

As a solution for a single value, use:

datestring = ' September 10, 2021 '
datestring = datestring.strip()  # strip() returns a new string, so reassign it

For a column in a DataFrame, use:

dummy = pd.DataFrame({'Date': [' September 10, 2021 ', ' September 11, 2021 ', ' September 12, 2021 ']})
dummy['Date'] = dummy['Date'].str.strip()  # vectorized strip over the whole column
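Putting the whitespace fix together with the conversion from the question, a short sketch might look like the following. The column name and sample values are taken from this answer's dummy data, and the format string is the one from the question.

```python
import pandas as pd

# Dummy data with stray leading and trailing spaces, as in the answer above.
dummy = pd.DataFrame({'Date': [' September 10, 2021 ',
                               ' September 11, 2021 ',
                               ' September 12, 2021 ']})

# Strip the whitespace first, then parse with the explicit format.
dummy['Date'] = pd.to_datetime(dummy['Date'].str.strip(), format='%B %d, %Y')

print(dummy.dtypes)
```

Note that str.strip() only removes whitespace, so it would not fix the BOM case mentioned in the other response.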