Split time duration between start_time and endtime by minute In Snowflake-CodePudding

I have a table with following format.

USER_ID	START_TIME	END_TIME
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000

I need to split the duration between START_TIME and END_TIME by adding a one minute slots as a new column such that for each minute i have a unique row. The result I want to see like this:

USER_ID	START_TIME	END_TIME	MINUTE_SLOT
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:54.000	2020-04-04 09:05:27.000
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:54.000	2020-04-04 09:06:27.000
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:54.000	2020-04-04 09:07:27.000
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:54.000	2020-04-04 09:08:27.000
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:54.000	2020-04-04 09:09:27.000

If anyone can tell me is this possible and how to do this in snowflake, that would be much appreciated, thanks!

CodePudding user response：

Assuming you want to keep the minute slots between the start_time and end_time and assuming that the end_time - start_time <= 24 hours:

with 
  t0 as
  (select
    'AAA001' as user_id,
    '2020-04-04 09:04:27.000' as start_time,
    '2020-04-04 09:08:34.000' as end_time,
    timediff("minutes",start_time, end_time) as min_range),
  t1 as
  (select row_number() over(order by 0) as i
   from table(generator(rowcount => 3600))) -- 3600 minutes in 24 hours
select
  t0.user_id,
  t0.start_time,
  t0.end_time,
  timeadd("minutes", i, t0.start_time) as minute_slot
from t1 cross join t0
where minute_slot <= end_time

Based on an idea at this answer

CodePudding user response：

With following data -

with data(USER_ID,START_TIME,END_TIME) as
(select * from values
('AAA001','2020-04-04 09:04:27.000'::timestamp,'2020-04-04 09:08:34.000'::timestamp),
('AAA002','2020-04-04 09:04:27.000'::timestamp,'2020-04-04 09:08:34.000'::timestamp)
),

First approach, as the one shared by @Dave, with qualify clause to limit (use large generator value) -

rn_cte as (
select row_number () over (partition by user_id order by 0) rn,user_id,
start_time,
end_time,
timeadd(minute,
rn,
start_time) new_time
from data,
table(generator(rowcount=>150))
qualify rn <= TIMESTAMPDIFF(minute, start_time,end_time)
)select user_id, start_time, end_time, new_time
from rn_cte;

Or, another one to get dummy rows and avoiding extra row generation due to limitation of giving constant value in generator clause.

select user_id,
start_time,
end_time,
timeadd(minute, index, start_time) minute_slot from data,
table(split_to_table(repeat(',',TIMESTAMPDIFF(minute, start_time,end_time)-1),','));

Both giving following output -

USER_ID	START_TIME	END_TIME	NEW_TIME
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:05:27.000
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:06:27.000
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:07:27.000
AAA001	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:08:27.000
AAA002	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:05:27.000
AAA002	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:06:27.000
AAA002	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:07:27.000
AAA002	2020-04-04 09:04:27.000	2020-04-04 09:08:34.000	2020-04-04 09:08:27.000