Home > database >  MySQL calculate duration by date till end of date using lead function
MySQL calculate duration by date till end of date using lead function

Time:05-22

I have a table that stores the records for users' different sessions(subscribe, unsubscribe, away, online). I am able to calculate the time duration for each session using the following given query. There is a scenario that supposes a user starts his online session on "15-May-2022 at 11:00:00 PM", after that on the next day he set himself away on "16-May-2022 at 02:00:00 AM" Total online is 3 Hours which I am getting as the last row on the date 15-May-2022.

But I need like this On 15-May till "15-May-2022 23:59:59" it was online for 1 Hour and on 16-May from "16-May-2022 00:00:00 to 16-May-2022 at 02:00:00 AM", it was online for 2 Hours. So in response, it should return 1 hour for 15-May and 2 hours for 16-May, not a total of 3 hours on 15-May.

I am using the lead function to get the duration from the created_at column, is there any way in which I can restrict the lead function to calculate duration till the next created_at till 23:59:59.

Here is my working query. I am using the latest MySQL(8) version.

select `id`, `user_id`, `status`, `created_at`,
 SEC_TO_TIME(TIMESTAMPDIFF(SECOND, created_at,
LEAD(created_at) OVER (PARTITION BY user_id ORDER BY created_at))) as duration,
 date(created_at) as date from `user_websocket_events` as `all_status`
 where created_at between '2022-05-15 00:00:00' and '2022-05-16 23:59:59' and `status` is not null
 and user_id in (69) order by `id` asc;

Here is some sample data.

INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10816, 69, 'subscribe', 'online', null, '2022-05-15 12:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10817, 69, 'away', 'away', null, '2022-05-15 20:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10818, 69, 'online', 'online', null, '2022-05-15 22:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10819, 69, 'away', 'away', null, '2022-05-16 02:57:31', '2022-05-14 10:57:37');
INSERT INTO user_websocket_events (id, user_id, event, status, extra_attributes, created_at, updated_at) VALUES (10820, 69, 'unsubscribe', 'unsubscribe', null, '2022-05-16 03:57:31', '2022-05-14 10:57:37');

CodePudding user response:

Using an on-the-fly calendar table to split a session by days

with recursive calendar as (
      select timestamp('2022-05-01 00:00') start_time, timestamp('2022-05-01 23:59:59')  end_time, 1 id 
      union all
      select start_time   interval 1 day, end_time   interval 1 day, id 1
      from calendar
      where id < 100
)
select e.id,  e.status, date(greatest(c.start_time, e.created_at)) date,
   greatest(c.start_time, e.created_at) as created_at,
   least(c.end_time, e.ended_at) as ended_at
from (
   select `id`, `user_id`, `status`, `created_at`,
     -- a session end is restricted to the end of the requierd interval 
     LEAD(created_at, 1, '2022-05-16 23:59:59') OVER (PARTITION BY user_id ORDER BY created_at) as ended_at
   from `user_websocket_events` 
   where 
     -- only sessions started within the requierd interval 
     created_at between '2022-05-15 00:00:00' and '2022-05-16 23:59:59' and `status` is not null
     and user_id in (69) 
   ) e
join calendar c on c.start_time < e.ended_at and e.created_at < c.end_time
order by id;

db<>fiddle

CodePudding user response:

You need to handle it using CASE statement.

Now instead of the actual created_at and LEAD(created_at), we need something like below.

First Case:

If the created_at and LEAD(created_at) falls on different date, then consider date(created_at) '23:59:59' else consider created_at as ENDTIME.

CASE
                  WHEN Date(lead_created_at)=Date(created_at) THEN lead_created_at
                  ELSE Addtime(Timestamp(Date(created_at)),'23:59:59')
END

Second Case:

If the created_at and LAG(created_at) falls on different date, then consider date(created_at) '00:00:00' else consider created_at as STARTTIME.

CASE
                  WHEN Date(lag_created_at)=Date(created_at) THEN created_at
                  ELSE Timestamp(Date(created_at))
END

Finally, the query can be written as below to get the desired output.

SELECT   `id`,
         `user_id`,
         `status`,
         `created_at`,
         CASE
                  WHEN Date(lag_created_at)=Date(created_at) THEN created_at
                  ELSE Timestamp(Date(created_at))
         end new_starttime,
         CASE
                  WHEN Date(lead_created_at)=Date(created_at) THEN lead_created_at
                  WHEN lead_created_at is null then null
                  ELSE Addtime(Timestamp(Date(created_at)),'23:59:59')
         end AS new_endtime,
         Sec_to_time(Timestampdiff(second,
         CASE
                  WHEN Date(lag_created_at)=Date(created_at) THEN created_at
                  ELSE Timestamp(Date(created_at))
         end,
         CASE
                  WHEN Date(lead_created_at)=Date(created_at) THEN lead_created_at
                  WHEN lead_created_at is null then null
                  ELSE Addtime(Timestamp(Date(created_at)),'23:59:59')
         end )) AS duration,
         date
FROM     (
                  SELECT   `id`,
                           `user_id`,
                           `status`,
                           `created_at`,
                           (Lead(created_at) over (partition BY user_id ORDER BY created_at))                   AS lead_created_at,
                           coalesce(lag(created_at) over (partition BY user_id ORDER BY created_at),created_at) AS lag_created_at,
                           date(created_at)                                                                     AS date
                  FROM     `user_websocket_events`                                                              AS `all_status`
                  WHERE    created_at BETWEEN '2022-05-15 00:00:00' AND      '2022-05-16 23:59:59'
                  AND      `status` IS NOT NULL
                  AND      user_id IN (69) )tmp
ORDER BY `id` ASC;

Resultset:

id user_id status created_at new_starttime new_endtime duration date
10816 69 online 2022-05-15 12:57:31 2022-05-15 12:57:31 2022-05-15 20:57:31 08:00:00 2022-05-15
10817 69 away 2022-05-15 20:57:31 2022-05-15 20:57:31 2022-05-15 22:57:31 02:00:00 2022-05-15
10818 69 online 2022-05-15 22:57:31 2022-05-15 22:57:31 2022-05-15 23:59:59 01:02:28 2022-05-15
10819 69 away 2022-05-16 02:57:31 2022-05-16 00:00:00 2022-05-16 03:57:31 03:57:31 2022-05-16
10820 69 unsubscribe 2022-05-16 03:57:31 2022-05-16 03:57:31 null null 2022-05-16

Note: The query will not handle the scenario where sessions last more than 48 hour.

DB Fiddle: Try it here

  • Related