Convert dd/mm/yy to dd/mm/yyyy correctly

Time:08-25

I am facing issues when converting a date from dd/mm/yy to dd/mm/yyyy with pandas.

I have tried this:

df['Date'] = pd.to_datetime(df['Date'])

But somehow for some values the day and month are swapping... Is there another way to convert correctly?

Thanks in advance for answering me :)

CodePudding user response:

Short Version

You can specify the format explicitly, but that doesn't rule out problems caused by the Y2K-style ambiguity inherent in two-digit years.

df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%y')

The string '10/12/68' will be parsed as December 10, 2068.

If possible, get whoever creates this value to use the unambiguous ISO 8601 format, i.e. YYYY-MM-DD.

Long Explanation

The correct way is to simply not use such strings. There's no way even a human could guess what xx/xx/xx means. What does 12/5/12 mean? December 5 or May 12? 1912 or 2012? How about 10/12/68?
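To make the ambiguity concrete, here is a minimal sketch using only the standard library. It parses the same '12/5/12' string with a day-first and a month-first pattern. Both calls succeed, so nothing in the string itself tells the parser which reading was intended.

```python
from datetime import datetime

raw = "12/5/12"

# The same text is valid under both patterns, so neither call fails.
day_first = datetime.strptime(raw, "%d/%m/%y")    # read as 12 May 2012
month_first = datetime.strptime(raw, "%m/%d/%y")  # read as 5 December 2012

print(day_first.date())    # 2012-05-12
print(month_first.date())  # 2012-12-05
```

Only a day value above 12 makes a string like this unambiguous, which is probably why the swap shows up for some rows and not others.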

This isn't nitpicking. Such bugs are still causing serious problems even now.

The real fix is to modify the application that generates such dates so that it uses the unambiguous YYYY-MM-DD format, or to use a source that has strongly-typed dates. For example, Excel and almost all databases have date types.

Workarounds

When that's not possible, you need to specify the date pattern using the format parameter.

df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%y')
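As a rough end-to-end sketch of what the question asks for: parse with the explicit pattern, then write the values back as dd/mm/yyyy strings with .dt.strftime. The sample values are made up for illustration, and the column name 'Date' is taken from the question.

```python
import pandas as pd

# Hypothetical sample data in dd/mm/yy form.
df = pd.DataFrame({"Date": ["25/08/21", "03/04/05", "10/12/68"]})

# Parse day-first explicitly so pandas never has to guess the field order.
parsed = pd.to_datetime(df["Date"], format="%d/%m/%y")

# Render back to text with a four-digit year.
df["Date"] = parsed.dt.strftime("%d/%m/%Y")

print(df["Date"].tolist())
# ['25/08/2021', '03/04/2005', '10/12/2068']
```

Note that the last value comes out as 2068. If you only need datetime values rather than strings, keep the parsed column and skip the strftime step.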

There's still a chance for errors though, because the two-digit year is expanded using a fixed cutoff rule: with %y, the values 00 to 68 become 2000 to 2068 and 69 to 99 become 1969 to 1999.

For example, in this SO question '10/12/68' was parsed as 2068-12-10 instead of 1968:

>>> from datetime import datetime
>>> datetime.strptime('10/12/68', '%d/%m/%y')
datetime.datetime(2068, 12, 10, 0, 0)

Oops
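One possible way to soften this, assuming your data can never contain dates later than some known limit (for example today, for birth dates or past transactions), is to move anything parsed beyond that limit back by one century. The helper name parse_ddmmyy and its latest parameter are hypothetical, and this is only a sketch. It cannot recover the century when both readings are plausible.

```python
from datetime import date, datetime


def parse_ddmmyy(text, latest=None):
    """Parse dd/mm/yy; a result after `latest` is moved back 100 years."""
    if latest is None:
        latest = date.today()  # assumption: no valid date lies in the future
    parsed = datetime.strptime(text, "%d/%m/%y").date()
    if parsed > latest:
        # %y mapped the year into 20xx, but that is later than allowed.
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


limit = date(2024, 1, 1)
print(parse_ddmmyy("10/12/68", latest=limit))  # 1968-12-10
print(parse_ddmmyy("25/08/21", latest=limit))  # 2021-08-25
```

The cutoff is a property of your data, not of the parser, so pick the limit deliberately instead of relying on the default.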
