I have a pandas column that has both PDT and PST datetime values. Example:
PDT/PST |
---|
2021-10-29 00:18:38 PDT |
2021-10-29 01:08:19 PDT |
2021-11-08 19:43:58 PST |
2021-11-08 19:56:01 PST |
I need to convert these to UTC. Example:
UTC |
---|
2021-10-29 07:18:00 |
A simple answer is appreciated.
CodePudding user response:
Use to_datetime after converting the strings to datetimes with dateparser.parse:
import dateparser
import pandas as pd

# Parse each string (including its PDT/PST abbreviation), then normalize to UTC
df['PDT/PST'] = pd.to_datetime(df['PDT/PST'].apply(dateparser.parse), utc=True)
print(df)
PDT/PST
0 2021-10-29 07:18:38+00:00
1 2021-10-29 08:08:19+00:00
2 2021-11-09 03:43:58+00:00
3 2021-11-09 03:56:01+00:00
Finally, add Series.dt.tz_localize(None) to remove the timezone information:
df['PDT/PST'] = (pd.to_datetime(df['PDT/PST'].apply(dateparser.parse), utc=True)
.dt.tz_localize(None))
print (df)
PDT/PST
0 2021-10-29 07:18:38
1 2021-10-29 08:08:19
2 2021-11-09 03:43:58
3 2021-11-09 03:56:01
Alternatively, replace PDT and PST with their UTC offsets, -07:00 and -08:00:
df['PDT/PST'] = (pd.to_datetime(df['PDT/PST']
.replace({'PDT':'-07:00','PST':'-08:00'}, regex=True), utc=True)
.dt.tz_localize(None))
print (df)
PDT/PST
0 2021-10-29 07:18:38
1 2021-10-29 08:08:19
2 2021-11-09 03:43:58
3 2021-11-09 03:56:01
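To make the offset arithmetic behind that replacement explicit, here is a minimal sketch using only the standard library. It assumes every value has the form "YYYY-MM-DD HH:MM:SS ABBR" and that only PDT and PST occur; the names OFFSETS and to_utc_naive are made up for illustration and are not part of pandas or dateparser.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table: fixed UTC offsets for the two abbreviations.
OFFSETS = {
    "PDT": timezone(timedelta(hours=-7), "PDT"),
    "PST": timezone(timedelta(hours=-8), "PST"),
}


def to_utc_naive(text):
    """Parse 'YYYY-MM-DD HH:MM:SS ABBR' and return a naive datetime in UTC."""
    # Split the trailing abbreviation off the timestamp.
    stamp, abbr = text.rsplit(" ", 1)
    # Attach the fixed offset that the abbreviation stands for.
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=OFFSETS[abbr])
    # Shift to UTC, then drop the tzinfo (the analogue of tz_localize(None)).
    return local.astimezone(timezone.utc).replace(tzinfo=None)


for s in ["2021-10-29 00:18:38 PDT", "2021-11-08 19:43:58 PST"]:
    print(s, "->", to_utc_naive(s))
```

An unknown abbreviation raises KeyError here, which may be preferable to silently parsing it as a naive local time.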
CodePudding user response:
Another option: dateutil's parser with tzinfos supplied; then convert to UTC.
import dateutil.parser
import dateutil.tz
pacific_tz = dateutil.tz.gettz("US/Pacific")
df['UTC'] = df['PDT/PST'].apply(dateutil.parser.parse,
tzinfos={'PST': pacific_tz,
'PDT': pacific_tz}).dt.tz_convert('UTC')
df['UTC']
0 2021-10-29 07:18:38+00:00
1 2021-10-29 08:08:19+00:00
2 2021-11-09 03:43:58+00:00
3 2021-11-09 03:56:01+00:00
Name: UTC, dtype: datetime64[ns, UTC]
Related: Python strptime() and timezones?
You can now format the result as strings in a specific format if desired, e.g.
df['UTC'].dt.strftime('%Y-%m-%d %H:%M:%S')
0 2021-10-29 07:18:38
1 2021-10-29 08:08:19
2 2021-11-09 03:43:58
3 2021-11-09 03:56:01
Name: UTC, dtype: object