Home > Mobile >  Comparing timedelta column to hours
Comparing timedelta column to hours

Time:06-26

I want to get the time difference in the between below two columns. then compare the time difference hours to certain ranges. i.e if hours > 24 then x, if hours > 8 then y, if hours > 1 the z, and finally if hours <= 1 then done.

I have both columuns as below

df = pd.DataFrame({'Start': {0: '2022-05-01 14:32:07',
  1: '2022-05-01 13:09:20',
  2: '2022-05-01 17:19:12',
  3: '2022-05-01 16:12:03',
  4: '2022-05-01 13:24:42',
  5: '2022-05-01 02:24:17',
  6: '2022-05-01 12:53:35',
  7: '2022-05-01 09:57:34',
  8: '2022-05-01 17:24:29',
  9: '2022-05-01 09:58:15'},
 'End': {0: '2022-05-01 15:39:51',
  1: '2022-05-02 13:10:56',
  2: '2022-05-01 21:29:04',
  3: '2022-05-01 23:56:27',
  4: '2022-05-03 13:28:05',
  5: '2022-05-01 16:24:18',
  6: '2022-05-01 12:55:42',
  7: '2022-05-01 19:57:35',
  8: '2022-05-01 17:30:06',
  9: '2022-05-01 19:58:16'}})
df = df.apply(pd.to_datetime)  # Just as imported data already in datetime format

Output:


                Start               End
0   2022-05-01 14:32:07 2022-05-01 15:39:51
1   2022-05-01 13:09:20 2022-05-02 13:10:56
2   2022-05-01 17:19:12 2022-05-01 21:29:04
3   2022-05-01 16:12:03 2022-05-01 23:56:27
4   2022-05-01 13:24:42 2022-05-03 13:28:05
5   2022-05-01 02:24:17 2022-05-01 16:24:18
6   2022-05-01 12:53:35 2022-05-01 12:55:42
7   2022-05-01 09:57:34 2022-05-01 19:57:35
8   2022-05-01 17:24:29 2022-05-01 17:30:06
9   2022-05-01 09:58:15 2022-05-01 19:58:16

What I did so far:

df['Duration'] = df['End'] - df['Start']
df

output:
                Start                  End        Duration
0   2022-05-01 14:32:07 2022-05-01 15:39:51 0 days 01:07:44
1   2022-05-01 13:09:20 2022-05-02 13:10:56 1 days 00:01:36
2   2022-05-01 17:19:12 2022-05-01 21:29:04 0 days 04:09:52
3   2022-05-01 16:12:03 2022-05-01 23:56:27 0 days 07:44:24
4   2022-05-01 13:24:42 2022-05-03 13:28:05 2 days 00:03:23
5   2022-05-01 02:24:17 2022-05-01 16:24:18 0 days 14:00:01
6   2022-05-01 12:53:35 2022-05-01 12:55:42 0 days 00:02:07
7   2022-05-01 09:57:34 2022-05-01 19:57:35 0 days 10:00:01
8   2022-05-01 17:24:29 2022-05-01 17:30:06 0 days 00:05:37
9   2022-05-01 09:58:15 2022-05-01 19:58:16 0 days 10:00:01

I also tried to convert the Duration column to datetime (using: df['Duration'] = pd.to_datetime(df['Duration']) then try to compare but I still get errors.

CodePudding user response:

Here is one way to do it:

df = df.apply(pd.to_datetime)

# As suggested in the comments
delta = (df["End"] - df["Start"]).dt.total_seconds() // 3600

# You do not specify what x, y, z are supposed to be, so this is an example
df["Delta"] = (
    delta
    .mask(delta > 24, "> 24h")
    .mask((delta <= 24) & (delta > 8), "> 8h")
    .mask((delta <= 8) & (delta > 1), "> 1h")
    .mask((delta <= 1), "<= 1h")
)
print(df)
# Output
                Start                 End  Delta
0 2022-05-01 14:32:07 2022-05-01 15:39:51  <= 1h
1 2022-05-01 13:09:20 2022-05-02 13:10:56   > 8h
2 2022-05-01 17:19:12 2022-05-01 21:29:04   > 1h
3 2022-05-01 16:12:03 2022-05-01 23:56:27   > 1h
4 2022-05-01 13:24:42 2022-05-03 13:28:05  > 24h
5 2022-05-01 02:24:17 2022-05-01 16:24:18   > 8h
6 2022-05-01 12:53:35 2022-05-01 12:55:42  <= 1h
7 2022-05-01 09:57:34 2022-05-01 19:57:35   > 8h
8 2022-05-01 17:24:29 2022-05-01 17:30:06  <= 1h
9 2022-05-01 09:58:15 2022-05-01 19:58:16   > 8h
  • Related