Converting PDT/PST timezone column to UTC timezone

Time:12-03

I have a pandas column that has both PDT and PST datetime values. Example:

PDT/PST
2021-10-29 00:18:38 PDT
2021-10-29 01:08:19 PDT
2021-11-08 19:43:58 PST
2021-11-08 19:56:01 PST

I need to convert these into UTC time zone. Example:

UTC
2021-10-29 07:18:00

A simple answer is appreciated.

CodePudding user response:

Use to_datetime with utc=True, parsing the strings into datetimes with dateparser.parse (a third-party package):

import dateparser
import pandas as pd

# dateparser.parse reads the PDT/PST abbreviations; utc=True converts the result to UTC
df['PDT/PST'] = pd.to_datetime(df['PDT/PST'].apply(dateparser.parse), utc=True)
print(df)
                    PDT/PST
0 2021-10-29 07:18:38+00:00
1 2021-10-29 08:08:19+00:00
2 2021-11-09 03:43:58+00:00
3 2021-11-09 03:56:01+00:00

Finally, add Series.dt.tz_localize(None) to drop the timezone and keep naive UTC datetimes:

df['PDT/PST'] = (pd.to_datetime(df['PDT/PST'].apply(dateparser.parse), utc=True)
                   .dt.tz_localize(None))
print(df)
              PDT/PST
0 2021-10-29 07:18:38
1 2021-10-29 08:08:19
2 2021-11-09 03:43:58
3 2021-11-09 03:56:01

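A solution that replaces PDT and PST with their UTC offsets, -07:00 and -08:00, is:

# regex=True makes replace substitute the abbreviation inside each string
df['PDT/PST'] = (pd.to_datetime(df['PDT/PST']
                                .replace({'PDT': '-07:00', 'PST': '-08:00'}, regex=True), utc=True)
                  .dt.tz_localize(None))
print(df)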
              PDT/PST
0 2021-10-29 07:18:38
1 2021-10-29 08:08:19
2 2021-11-09 03:43:58
3 2021-11-09 03:56:01
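
To see why substituting the offsets works, here is a minimal standard-library sketch of the same arithmetic on single strings. The helper name pacific_to_utc and the OFFSETS table are made up for illustration, and the sketch assumes every value ends in exactly PDT or PST:

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets for the two abbreviations (assumed to be the only ones present).
OFFSETS = {
    "PDT": timezone(timedelta(hours=-7)),
    "PST": timezone(timedelta(hours=-8)),
}


def pacific_to_utc(text):
    """Parse 'YYYY-MM-DD HH:MM:SS PDT|PST' and return a naive UTC datetime."""
    stamp, abbr = text.rsplit(" ", 1)
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=OFFSETS[abbr])
    # Shift to UTC, then drop the tzinfo so the result is a plain datetime.
    return local.astimezone(timezone.utc).replace(tzinfo=None)


samples = ["2021-10-29 00:18:38 PDT", "2021-11-08 19:43:58 PST"]
converted = [pacific_to_utc(s) for s in samples]
for s, c in zip(samples, converted):
    print(s, "->", c)
```

A PDT value moves forward by 7 hours and a PST value by 8, which matches the output above.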

CodePudding user response:

Another option: dateutil's parser with tzinfos supplied; then convert to UTC.

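import dateutil.parser
import dateutil.tz

# both abbreviations map to the same zone; its rules pick the offset for each date
pacific_tz = dateutil.tz.gettz("US/Pacific")

df['UTC'] = df['PDT/PST'].apply(dateutil.parser.parse,
                                tzinfos={'PST': pacific_tz,
                                         'PDT': pacific_tz}).dt.tz_convert('UTC')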

df['UTC']

0   2021-10-29 07:18:38+00:00
1   2021-10-29 08:08:19+00:00
2   2021-11-09 03:43:58+00:00
3   2021-11-09 03:56:01+00:00
Name: UTC, dtype: datetime64[ns, UTC]

Related: Python strptime() and timezones?

You can then format the result as strings in whatever format you need, e.g.

df['UTC'].dt.strftime('%Y-%m-%d %H:%M:%S')

0    2021-10-29 07:18:38
1    2021-10-29 08:08:19
2    2021-11-09 03:43:58
3    2021-11-09 03:56:01
Name: UTC, dtype: object
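
One caveat when both abbreviations map to the same zone: during the repeated hour at the end of daylight saving time, the zone rules alone cannot tell a PDT reading from a PST one. A possible pandas-only sketch, not taken from the answers above, keeps the abbreviation as a DST flag and passes it to tz_localize through its ambiguous parameter. The two 2021-11-07 rows are sample data made up to show the ambiguous case, and the sketch assumes every value ends in " PDT" or " PST":

```python
import pandas as pd

df = pd.DataFrame({"PDT/PST": [
    "2021-10-29 00:18:38 PDT",
    "2021-11-08 19:43:58 PST",
    "2021-11-07 01:30:00 PDT",  # made-up row: first pass through the repeated hour
    "2021-11-07 01:30:00 PST",  # made-up row: second pass through the repeated hour
]})

# True where the label says daylight time; consulted only for ambiguous wall times.
is_dst = df["PDT/PST"].str.endswith("PDT").to_numpy()

# Drop the trailing " PDT" / " PST" (4 characters) and parse as naive datetimes.
naive = pd.to_datetime(df["PDT/PST"].str[:-4], format="%Y-%m-%d %H:%M:%S")

df["UTC"] = (naive.dt.tz_localize("America/Los_Angeles", ambiguous=is_dst)
                  .dt.tz_convert("UTC")
                  .dt.tz_localize(None))
print(df)
```

Outside the repeated hour the flag is ignored and the zone rules decide the offset, so a mislabelled abbreviation on an ordinary date would not change the result.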