I am facing a little problem with time data in a dataframe. I am trying to convert mixed formats with single digits and slashes (2/19/22) as well as double digits with hyphen (2020-02-19) to zero padded data with point (19.02.2022).
Data:
d = {'Date':['24.06.2022','2022-01-23', '7/2/22', '9/23/18', '23.05.2022'],
'Flight': ['Rome', 'Zurich', 'Dallas', 'Madrid', 'Stockholm']}
df = pd.DataFrame(data=d, index=[0, 1, 2, 3, 4])
My code:
def autoconvert_datetime(value):
formats = ['%-m/%-d/%y', '%Y-%m-%d']
result_format = '%d.%m.%Y'
for dt_format in formats:
try:
dt_obj = datetime.strptime(value, dt_format)
return dt_obj.strftime(result_format)
except Exception as e: # throws exception when format doesn't match
pass
return value
df['Date'] = df['Date'].apply(autoconvert_datetime)
For other datetime formats it worked without any problem.
CodePudding user response:
Use pd.to_datetime
and let Pandas infers the format:
df['Date2'] = pd.to_datetime(df['Date'])
print(df)
# Output
Date Flight Date2
0 24.06.2022 Rome 2022-06-24
1 2022-01-23 Zurich 2022-01-23
2 7/2/22 Dallas 2022-07-02
3 9/23/18 Madrid 2018-09-23
4 23.05.2022 Stockholm 2022-05-23
If you want more control, you can directly use dateutil.parser
used by Pandas.