Home > Software engineering >  Split time duration between start_time and endtime by minute In Snowflake
Split time duration between start_time and endtime by minute In Snowflake

Time:07-06

I have a table with following format.

USER_ID START_TIME END_TIME
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000

I need to split the duration between START_TIME and END_TIME by adding a one minute slots as a new column such that for each minute i have a unique row. The result I want to see like this:

USER_ID START_TIME END_TIME MINUTE_SLOT
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:54.000 2020-04-04 09:05:27.000
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:54.000 2020-04-04 09:06:27.000
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:54.000 2020-04-04 09:07:27.000
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:54.000 2020-04-04 09:08:27.000
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:54.000 2020-04-04 09:09:27.000

If anyone can tell me is this possible and how to do this in snowflake, that would be much appreciated, thanks!

CodePudding user response:

Assuming you want to keep the minute slots between the start_time and end_time and assuming that the end_time - start_time <= 24 hours:

with 
  t0 as
  (select
    'AAA001' as user_id,
    '2020-04-04 09:04:27.000' as start_time,
    '2020-04-04 09:08:34.000' as end_time,
    timediff("minutes",start_time, end_time) as min_range),
  t1 as
  (select row_number() over(order by 0) as i
   from table(generator(rowcount => 3600))) -- 3600 minutes in 24 hours
select
  t0.user_id,
  t0.start_time,
  t0.end_time,
  timeadd("minutes", i, t0.start_time) as minute_slot
from t1 cross join t0
where minute_slot <= end_time

Based on an idea at this answer

CodePudding user response:

With following data -

with data(USER_ID,START_TIME,END_TIME) as
(select * from values
('AAA001','2020-04-04 09:04:27.000'::timestamp,'2020-04-04 09:08:34.000'::timestamp),
('AAA002','2020-04-04 09:04:27.000'::timestamp,'2020-04-04 09:08:34.000'::timestamp)
),

First approach, as the one shared by @Dave, with qualify clause to limit (use large generator value) -

rn_cte as (
select row_number () over (partition by user_id order by 0) rn,user_id,
start_time,
end_time,
timeadd(minute,
rn,
start_time) new_time
from data,
table(generator(rowcount=>150))
qualify rn <= TIMESTAMPDIFF(minute, start_time,end_time)
)select user_id, start_time, end_time, new_time
from rn_cte;

Or, another one to get dummy rows and avoiding extra row generation due to limitation of giving constant value in generator clause.

select user_id,
start_time,
end_time,
timeadd(minute, index, start_time) minute_slot from data,
table(split_to_table(repeat(',',TIMESTAMPDIFF(minute, start_time,end_time)-1),','));

Both giving following output -

USER_ID START_TIME END_TIME NEW_TIME
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:05:27.000
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:06:27.000
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:07:27.000
AAA001 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:08:27.000
AAA002 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:05:27.000
AAA002 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:06:27.000
AAA002 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:07:27.000
AAA002 2020-04-04 09:04:27.000 2020-04-04 09:08:34.000 2020-04-04 09:08:27.000
  • Related