I have two datetime objects which I want to subtract - however they both need to be in same format
I tried to convert datetime64[ns, pytz.FixedOffset(-240)] (eastern time zone) however I run into errors. Other datetime object is datetime64[ns] which is already in est timezone
1) df['date'].strftime('%Y-%m-%d %H:%M:%S')
error: 'Series' object has no attribute 'strftime'
2) df['date'].replace(tzinfo=None)
error: replace() got an unexpected keyword argument 'tzinfo'
3) df['date'].dt_tz.replace(tzinfo=None)
error: 'Series' object has no attribute 'dt_tz'
CodePudding user response:
In pandas, if you have mixed time zones or UTC offsets, you will get
TypeError: DatetimeArray subtraction must have the same timezones or no timezones
when trying to calculate a timedelta. The error basically tells you how to avoid it: convert everything to the same tz, for example:
import pandas as pd
df = pd.DataFrame({
'date0': pd.to_datetime(["2021-08-01 00:00 -04:00"]), # should be US/Eastern
'date1': pd.to_datetime(["2021-08-01 01:00"]) # should be US/Eastern as well
})
# date0 date1
# 0 2021-08-01 00:00:00-04:00 2021-08-01 01:00:00
# date0 already has a UTC offset but we can set a proper time zone:
df['date0'] = df['date0'].dt.tz_convert('America/New_York')
# date1 is naive, i.e. does not have a time zone, so we need to localize:
df['date1'] = df['date1'].dt.tz_localize('America/New_York')
# since both datetime columns now have the same time zone, we can calculate:
print(df['date1'] - df['date0'])
# 0 0 days 01:00:00
# dtype: timedelta64[ns]
Python's datetime isn't that picky, you can easily calculate timedelta from datetime objects with different time zones:
from datetime import datetime
from zoneinfo import ZoneInfo # Python 3.9
d0 = datetime(2021, 1, 1, tzinfo=ZoneInfo("UTC"))
d1 = datetime(2020, 12, 31, 20, tzinfo=ZoneInfo('America/New_York'))
print(d1-d0)
# 1:00:00
Keep in mind that Python's timedelta arithmetic is wall-time arithmetic; you can do weird stuff like this. So it's sometimes less obvious what's going on I'd say.
CodePudding user response:
While @MrFuppes answer is detailed for generic case since one of my dataframe was already in tz format I had to take below steps which worked
Initial format
datetime64[ns, pytz.FixedOffset(-240)] (eastern time zone)
1) Step taken
pd.to_datetime((df['date']).dt.tz_convert('US/Eastern'))
Initial Format
datetime64[ns]
2) Step taken
pd.to_datetime((df['date1']).dt.tz_localize('US/Eastern'))
This two steps brought datetime in same format for me to perform arithmetic operations