I have a dataset where the date_time column contains a mixture of BST and GMT date and times, in the following format 'Sun 27 Mar 2022 12:59:03 AM GMT'. I would like to convert this whole column into the following format '2022-03-27 00:59:03', and into one standard time zone, such as UTC. Where am I going wrong??
I tried using strptime() to convert the a tester date, but that only works with GMT times.
date_time_str = 'Sun 27 Mar 2022 12:59:03 AM GMT'
date_time_obj = datetime.datetime.strptime(date_time_str, '%a %d %b %Y %I:%M:%S %p %Z')
print('Date-time:', date_time_obj)
Date-time: 2022-03-27 00:59:03
Then tried with BST
date_time_str_bst = 'Sun 27 Mar 2022 02:00:02 AM BST'
date_time_obj_bst = datetime.datetime.strptime(date_time_str_bst, '%a %d %b %Y %I:%M:%S %p %Z')
ValueError: time data 'Sun 27 Mar 2022 02:00:02 AM BST' does not match format '%a %d %b %Y %I:%M:%S %p %Z'
CodePudding user response:
Here's an option how you can handle this. As commented, abbreviated tz names such as "BST" are ambiguous. You will have to define to which tz the abbreviations map; e.g. "Europe/London" for BST/GMT.
define timezone mapping {abbreviation --> IANA tz name}, apply parser, then convert to UTC
given
df
Date and time temp
0 Sun 27 Mar 2022 12:57:03 AM GMT 33.9
1 Sun 27 Mar 2022 12:58:02 AM GMT 33.6
2 Sun 27 Mar 2022 12:59:03 AM GMT 33.6
3 Sun 27 Mar 2022 02:00:02 AM BST 33.9
4 Sun 27 Mar 2022 02:01:03 AM BST 33.6
5 Sun 27 Mar 2022 02:02:02 AM BST 33.6
running
import pandas as pd
import dateutil
tzmapping = {"GMT": dateutil.tz.gettz("Europe/London"),
"BST": dateutil.tz.gettz("Europe/London")}
df["dt_UTC"] = df["Date and time"].apply(dateutil.parser.parse, tzinfos=tzmapping).dt.tz_convert("UTC")
gives
df
Date and time temp dt_UTC
0 Sun 27 Mar 2022 12:57:03 AM GMT 33.9 2022-03-27 00:57:03 00:00
1 Sun 27 Mar 2022 12:58:02 AM GMT 33.6 2022-03-27 00:58:02 00:00
2 Sun 27 Mar 2022 12:59:03 AM GMT 33.6 2022-03-27 00:59:03 00:00
3 Sun 27 Mar 2022 02:00:02 AM BST 33.9 2022-03-27 01:00:02 00:00
4 Sun 27 Mar 2022 02:01:03 AM BST 33.6 2022-03-27 01:01:03 00:00
5 Sun 27 Mar 2022 02:02:02 AM BST 33.6 2022-03-27 01:02:02 00:00