pd.to_datetime convert non date strings to date

Time:08-30

In a dataframe I have a column "_date_". This column contains lots of strings, only some of which are dates. I'm trying to convert the strings that contain dates into datetimes.

I have used pd.to_datetime with errors='coerce', but some of the non-date strings are also converted to dates. For example, "--70--" is converted to "2070-01-01".

How can I prevent pandas from converting non-date strings to dates?

import pandas as pd

df = pd.DataFrame(
    {'_date_': ['22 Jun 2023', '123', '--99--', "--70--", "b", "c", "d"]})

df["_date_"] = pd.to_datetime(df['_date_'], errors='coerce')
df['_date_']

0   2023-06-22
1          NaT
2   1999-01-01
3   2070-01-01
4          NaT
5          NaT
6          NaT
Name: _date_, dtype: datetime64[ns]

Answer:

If your dates follow a known format, pass it explicitly:

pd.to_datetime(df['_date_'], format='%d %b %Y', errors='coerce')
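As a minimal sketch of what the explicit format does with the question's sample data: anything that does not match '%d %b %Y' exactly becomes NaT, including "--99--" and "--70--". This assumes the default English locale for the month abbreviation, and the variable name `parsed` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'_date_': ['22 Jun 2023', '123', '--99--', '--70--', 'b', 'c', 'd']})

# With an explicit format, strings that do not match it are coerced to NaT
# instead of being guessed at by the flexible parser.
parsed = pd.to_datetime(df['_date_'], format='%d %b %Y', errors='coerce')

print(parsed)
```

Only the first row should survive as a real timestamp; the remaining rows come back as NaT.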

Otherwise, you can replace a specific unwanted character so that the string is guaranteed to be unparseable. Here a leading - is replaced, while hyphens in the middle of the string are left alone so that something like 2022-08-29 still parses correctly:

pd.to_datetime(df['_date_'].str.replace('^-', 'invalid', regex=True),
               errors='coerce')

output:

0   2023-06-22
1          NaT
2          NaT
3          NaT
4          NaT
5          NaT
6          NaT
Name: _date_, dtype: datetime64[ns]
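A different approach, not part of the answer above, is to whitelist instead of blacklist: only let strings through that look like a date, and blank out everything else before parsing. The sketch below assumes dates shaped like "22 Jun 2023"; the regular expression and the names `looks_like_date` and `candidates` are hypothetical and would need adjusting to the real data.

```python
import pandas as pd

df = pd.DataFrame(
    {'_date_': ['22 Jun 2023', '123', '--99--', '--70--', 'b', 'c', 'd']})

# Hypothetical whitelist: 1-2 digits, a 3-letter month, a 4-digit year.
# str.match anchors at the start of the string; the trailing $ anchors the end.
looks_like_date = df['_date_'].str.match(r'\d{1,2} [A-Za-z]{3} \d{4}$')

# Keep only the candidates; every other entry becomes NaN and then NaT.
candidates = df['_date_'].where(looks_like_date)
parsed = pd.to_datetime(candidates, errors='coerce')

print(parsed)
```

The trade-off is that the pattern has to be kept in sync with the date shapes you actually expect, but strings such as "--70--" never reach the parser at all.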