I am trying to subtract values of two datetime objects like the following:
df["Time Taken"] = (pd.to_datetime(df['end_time']).dt.tz_convert('Asia/Kolkata').dt.time) - \
(pd.to_datetime(df['start_time']).dt.tz_convert('Asia/Kolkata').dt.time)
in a Dataframe like the following:
id serial reference_number date warehouse owner start_time end_time
0 352 655556555 5002 2022-05-15T13:33:00.208423Z Delhivery Goa Warehouse Delhivery_Goa 2022-05-31T11:26:10.977240Z 2022-05-31T11:59:55.421159Z
1 362 E2806995000040043136895A 5007 2022-05-18T10:37:28.871625Z Delhivery Goa Warehouse Delhivery_Goa 2022-05-31T11:26:10.977240Z 2022-05-31T11:59:55.421159Z
2 363 E28069950000500431368519 5008 2022-05-18T10:44:32.398842Z Delhivery Goa Warehouse Delhivery_Goa 2022-05-31T11:26:10.977240Z 2022-05-31T11:59:55.421159Z
3 364 E28069950000400431368536 5008 2022-05-18T10:44:32.398842Z Delhivery Goa Warehouse Delhivery_Goa 2022-05-31T11:26:10.977240Z 2022-05-31T11:59:55.421159Z
4 370 E28069950000500431368549 5010 2022-05-18T12:30:34.599759Z Delhivery Goa Warehouse Delhivery_Goa 2022-05-31T11:26:10.977240Z 2022-05-31T11:59:55.421159Z
I tried timedelta
as well but the I got the following error:
only leading negative signs are allowed
How do I do it?
CodePudding user response:
First subtract columns converted to datetimes by Series.sub
, for seconds use Series.dt.total_seconds
::
df["Time Taken"] = (pd.to_datetime(df['end_time']).sub(pd.to_datetime(df['start_time']))
.dt.total_seconds())
For format HH:MM:SS
from timedeltas use:
def f(x):
ts = x.total_seconds()
hours, remainder = divmod(ts, 3600)
minutes, seconds = divmod(remainder, 60)
return ('{}:{:02d}:{:02d}').format(int(hours), int(minutes), int(seconds))
df["Time Taken"] = (pd.to_datetime(df['end_time']).sub(pd.to_datetime(df['start_time']))
.apply(f))
CodePudding user response:
You're making this way more complicated than it needs to be:
Just
df["Time Taken"] = (
pd.to_datetime(df['end_time'])
- pd.to_datetime(df['start_time'])
)
This should give you a column of timedeltas which you can format as you wish.
Steps you can skip:
- Converting the time zone. The times in the database already specify specific points in time, so the difference can be calculated. Converting from UTC to local doesn't change the result.
- Extracting the time from the date with
.dt.time
. This is what causes the error because times don't support subtraction. Subtraction isn't implemented for these because the concept of a difference between two times isn't well-defined. Like is (1am - 4am) -3 hours or 21 hours or even -2 or -4 hours because a switch between daylight savings and normal time occured?