I am using pandas to resample a DataFrame on its timestamp column at a given frequency. To do that, I seem to have to remove the timezone information first, so I do something like:
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.tz_localize(None)
# Now resample
df = df.resample('1H', on='timestamp').mean()
Now I want to put the original timezone back on the timestamp column.
How can I save the timezone before removing it and restore it afterwards?
CodePudding user response:
You can get the timezone with Series.dt.tz, save it before clearing the timezone with dt.tz_localize(None), and restore it from the saved value afterwards. For example:
import pandas as pd

sr = pd.Series(pd.date_range('2021-09-29 00:00', periods=6, freq='D',
                             tz='US/Central'))
df = pd.DataFrame({'timestamp': sr, 'val': range(1, 7)})
timezone = df['timestamp'].dt.tz # get and save time zone
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.tz_localize(None)
# Now resample
df = df.resample('1H', on='timestamp').mean()
# resample moved 'timestamp' into the index, so restore the timezone there
df.index = df.index.tz_localize(timezone)
print(df.index)
### timezone restored
DatetimeIndex(['2021-09-29 00:00:00-05:00', '2021-09-29 01:00:00-05:00',
               '2021-09-29 02:00:00-05:00', '2021-09-29 03:00:00-05:00',
               '2021-09-29 04:00:00-05:00', '2021-09-29 05:00:00-05:00',
               '2021-09-29 06:00:00-05:00', '2021-09-29 07:00:00-05:00',
               '2021-09-29 08:00:00-05:00', '2021-09-29 09:00:00-05:00',
               ...
               '2021-10-03 15:00:00-05:00', '2021-10-03 16:00:00-05:00',
               '2021-10-03 17:00:00-05:00', '2021-10-03 18:00:00-05:00',
               '2021-10-03 19:00:00-05:00', '2021-10-03 20:00:00-05:00',
               '2021-10-03 21:00:00-05:00', '2021-10-03 22:00:00-05:00',
               '2021-10-03 23:00:00-05:00', '2021-10-04 00:00:00-05:00'],
              dtype='datetime64[ns, US/Central]', name='timestamp', length=121, freq=None)
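If you need this in more than one place, the save/strip/resample/restore steps above could be wrapped in a small function. This is only a sketch: resample_keep_tz is a hypothetical name, not a pandas function, and it assumes the column already has a datetime dtype and that the other columns are numeric. The '12h' rule is just an illustrative frequency.

```python
import pandas as pd


def resample_keep_tz(df, column, rule):
    """Hypothetical helper: resample on `column`, then reattach its timezone."""
    tz = df[column].dt.tz  # None if the column is timezone-naive
    naive = df.copy()      # leave the caller's frame untouched
    naive[column] = naive[column].dt.tz_localize(None)  # drop tz, keep wall time
    out = naive.resample(rule, on=column).mean()
    if tz is not None:
        # `column` is now the index, so the timezone goes back on the index.
        out.index = out.index.tz_localize(tz)
    return out


sr = pd.Series(pd.date_range('2021-09-29 00:00', periods=6, freq='D',
                             tz='US/Central'))
df = pd.DataFrame({'timestamp': sr, 'val': range(1, 7)})
result = resample_keep_tz(df, 'timestamp', '12h')
print(result.index.tz)
```

One caveat: tz_localize can raise if the resampled wall-clock times fall on a daylight-saving transition (ambiguous or nonexistent times). Its ambiguous and nonexistent parameters let you choose how those cases are handled.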