pandas: how to restore a timezone for a column

Time:09-30

I am using pandas to resample a column with a certain frequency. For that, I seem to have to remove the timezone information. So, I do something like:

df['timestamp'] = pd.to_datetime(df['timestamp']).dt.tz_localize(None)
# Now resample
df = df.resample('1H', on='timestamp').mean()

Now I want to restore the original timezone information on the timestamp column. How can I save the timezone before removing it and restore it afterwards?

Answer:

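You can read the timezone with Series.dt.tz and save it before clearing it with dt.tz_localize(None). After resampling, restore it by calling tz_localize with the saved value. Note that resample moves the timestamp column into the index, so the timezone is restored on the index. For example: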

import pandas as pd

# Sample data: six daily timestamps in the US/Central timezone
sr = pd.Series(pd.date_range('2021-09-29 00:00', periods=6, freq='D',
                             tz='US/Central'))

df = pd.DataFrame({'timestamp': sr, 'val': range(1, 7)})

timezone = df['timestamp'].dt.tz       # get and save the timezone

df['timestamp'] = pd.to_datetime(df['timestamp']).dt.tz_localize(None)
# Now resample ('1H' is deprecated in pandas 2.2 and later; use '1h' there)
df = df.resample('1H', on='timestamp').mean()

# Restore the timezone on the timestamps, which resample moved into the index
df.index = df.index.tz_localize(timezone)

print(df.index)

### timezone restored
DatetimeIndex(['2021-09-29 00:00:00-05:00', '2021-09-29 01:00:00-05:00',
               '2021-09-29 02:00:00-05:00', '2021-09-29 03:00:00-05:00',
               '2021-09-29 04:00:00-05:00', '2021-09-29 05:00:00-05:00',
               '2021-09-29 06:00:00-05:00', '2021-09-29 07:00:00-05:00',
               '2021-09-29 08:00:00-05:00', '2021-09-29 09:00:00-05:00',
               ...
               '2021-10-03 15:00:00-05:00', '2021-10-03 16:00:00-05:00',
               '2021-10-03 17:00:00-05:00', '2021-10-03 18:00:00-05:00',
               '2021-10-03 19:00:00-05:00', '2021-10-03 20:00:00-05:00',
               '2021-10-03 21:00:00-05:00', '2021-10-03 22:00:00-05:00',
               '2021-10-03 23:00:00-05:00', '2021-10-04 00:00:00-05:00'],
              dtype='datetime64[ns, US/Central]', name='timestamp', length=121, freq=None)
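
The save, strip, resample, and restore steps can also be wrapped in a small helper. The following is only a sketch under stated assumptions: the function name resample_and_restore_tz is hypothetical (it is not part of pandas), it aggregates with mean() as in the question, and it uses a daily rule so the result stays small.

```python
import pandas as pd


def resample_and_restore_tz(df, column, rule):
    """Hypothetical helper: resample on `column`, keeping its original timezone."""
    timezone = df[column].dt.tz          # save the timezone (None if tz-naive)
    if timezone is None:
        # Nothing to strip or restore for tz-naive timestamps
        return df.resample(rule, on=column).mean()

    # Drop the timezone on a copy, so the caller's DataFrame is left untouched
    naive = df.assign(**{column: df[column].dt.tz_localize(None)})
    out = naive.resample(rule, on=column).mean()

    # resample moved the timestamps into the index; restore the timezone there
    out.index = out.index.tz_localize(timezone)
    return out


sr = pd.Series(pd.date_range('2021-09-29 00:00', periods=6, freq='D',
                             tz='US/Central'))
df = pd.DataFrame({'timestamp': sr, 'val': range(1, 7)})

result = resample_and_restore_tz(df, 'timestamp', 'D')
print(result.index.tz)
```

If the data spans a daylight-saving transition, re-localizing naive wall-clock times may raise an error for ambiguous or nonexistent times. tz_localize accepts ambiguous and nonexistent arguments to control how those cases are handled.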