I am looking to do something like in this thread. However, I only want to subtract the time
component of the two datetime
columns.
For eg., given this dataframe:
ts1 ts2
0 2018-07-25 11:14:00 2018-07-27 12:14:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00
The expected output for ts2 -ts1
time component only should give:
ts1 ts2 ts_delta
0 2018-07-25 11:14:00 2018-07-27 12:14:00 1:00:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00 -1:00:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00 -0:17:00
So, for row 0
: the time for ts2
is 12:14:00
, the time for ts1
is 11:14:00
. The expected output is just these two times subtracting (don't care about the days
). In this case:
12:14:00
- 11:14:00
= 1:00:00
.
How would I do this in one single line?
CodePudding user response:
I've tried to simulate your problem in my local environment. Apparently pandas.datetime64' types supporting add/subtract operations. You don't actually need to access
datetime` object to execute these operations.
I did my experiments as below;
import pandas as pd
df = pd.DataFrame({'a' : ['2018-07-25 11:14:00', '2018-08-26 11:15:00', '2018-07-29 11:17:00'],
'b' : ['2018-07-27 12:14:00', '2018-09-24 10:15:00', '2018-07-22 11:00:00'] })
df['a'] = pd.to_datetime(df['a'])
df['b'] = pd.to_datetime(df['b'])
df['d'] = df['b'] - df['a']
and df is like;
a b d
0 2018-07-25 11:14:00 2018-07-27 12:14:00 2 days 01:00:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00 28 days 23:00:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00 -8 days 23:43:00
CodePudding user response:
Try this, first strip time, then put time on same day, subtract, and take absolute value.
l = lambda x: pd.to_datetime("01-01-1900 " x)
df["ts_delta"] = (
df["ts2"].dt.time.astype(str).apply(l) - df["ts1"].dt.time.astype(str).apply(l)
).abs()
df
Output:
ts1 ts2 ts_delta
0 2018-07-25 11:14:00 2018-07-27 12:14:00 0 days 01:00:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00 0 days 01:00:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00 0 days 00:17:00
CodePudding user response:
You need to set both datetimes to a common date first.
One way is to use pandas.DateOffset
:
o = pd.DateOffset(day=1, month=1, year=2022) # the exact numbers don't matter
# reset dates
ts1 = df['ts1'].add(o)
ts2 = df['ts2'].add(o)
# subtract
df['ts_delta'] = ts2.sub(ts1)
As one-liner:
df['ts_delta'] = df['ts2'].add((o:=pd.DateOffset(day=1, month=1, year=2022))).sub(df['ts1'].add(o))
Other way using a difference between ts2-ts1 (with dates) and ts2-ts1 (dates only):
df['ts_delta'] = (df['ts2'].sub(df['ts1'])
-df['ts2'].dt.normalize().sub(df['ts1'].dt.normalize())
)
output:
ts1 ts2 ts_delta
0 2018-07-25 11:14:00 2018-07-27 12:14:00 0 days 01:00:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00 -1 days 23:00:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00 -1 days 23:43:00
NB. don't get confused by the -1 days 23:00:00
, this is actually the ways to represent -1hour
CodePudding user response:
Another option, without the need for a reference date. Subtract ts1's time component from ts2 as a timedelta, then convert the resulting datetime to a timedelta by subtracting ts2' date:
df["delta_time"] = (df["ts2"] - pd.to_timedelta(df["ts1"].dt.time.astype(str))) - df["ts2"].dt.floor("d")
df
ts1 ts2 delta_time
0 2018-07-25 11:14:00 2018-07-27 12:14:00 0 days 01:00:00
1 2018-08-26 11:15:00 2018-09-24 10:15:00 -1 days 23:00:00
2 2018-07-29 11:17:00 2018-07-22 11:00:00 -1 days 23:43:00
CodePudding user response:
Use df['col_name'].dt.time
to get time from date time column. Let's assume your dataframe name is df. Now, in your case
t1 = df['ts1'].dt.time
t2 = df['ts2'].dt.time
df['ts_delta'] = t2 - t1
For single line
df['ts_delta'] = df['ts2'].dt.time - df['ts1'].dt.time
I hope it will resolve your issue. Happy Coding!