I have a table with following format.
USER_ID | START_TIME | END_TIME |
---|---|---|
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 |
I need to split the duration between START_TIME and END_TIME by adding a one minute slots as a new column such that for each minute i have a unique row. The result I want to see like this:
USER_ID | START_TIME | END_TIME | MINUTE_SLOT |
---|---|---|---|
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:54.000 | 2020-04-04 09:05:27.000 |
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:54.000 | 2020-04-04 09:06:27.000 |
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:54.000 | 2020-04-04 09:07:27.000 |
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:54.000 | 2020-04-04 09:08:27.000 |
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:54.000 | 2020-04-04 09:09:27.000 |
If anyone can tell me is this possible and how to do this in snowflake, that would be much appreciated, thanks!
CodePudding user response:
Assuming you want to keep the minute slots between the start_time and end_time and assuming that the end_time - start_time <= 24 hours:
with
t0 as
(select
'AAA001' as user_id,
'2020-04-04 09:04:27.000' as start_time,
'2020-04-04 09:08:34.000' as end_time,
timediff("minutes",start_time, end_time) as min_range),
t1 as
(select row_number() over(order by 0) as i
from table(generator(rowcount => 3600))) -- 3600 minutes in 24 hours
select
t0.user_id,
t0.start_time,
t0.end_time,
timeadd("minutes", i, t0.start_time) as minute_slot
from t1 cross join t0
where minute_slot <= end_time
Based on an idea at this answer
CodePudding user response:
With following data -
with data(USER_ID,START_TIME,END_TIME) as
(select * from values
('AAA001','2020-04-04 09:04:27.000'::timestamp,'2020-04-04 09:08:34.000'::timestamp),
('AAA002','2020-04-04 09:04:27.000'::timestamp,'2020-04-04 09:08:34.000'::timestamp)
),
First approach, as the one shared by @Dave, with qualify clause to limit (use large generator value) -
rn_cte as (
select row_number () over (partition by user_id order by 0) rn,user_id,
start_time,
end_time,
timeadd(minute,
rn,
start_time) new_time
from data,
table(generator(rowcount=>150))
qualify rn <= TIMESTAMPDIFF(minute, start_time,end_time)
)select user_id, start_time, end_time, new_time
from rn_cte;
Or, another one to get dummy rows and avoiding extra row generation due to limitation of giving constant value in generator
clause.
select user_id,
start_time,
end_time,
timeadd(minute, index, start_time) minute_slot from data,
table(split_to_table(repeat(',',TIMESTAMPDIFF(minute, start_time,end_time)-1),','));
Both giving following output -
USER_ID | START_TIME | END_TIME | NEW_TIME |
---|---|---|---|
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:05:27.000 |
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:06:27.000 |
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:07:27.000 |
AAA001 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:08:27.000 |
AAA002 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:05:27.000 |
AAA002 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:06:27.000 |
AAA002 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:07:27.000 |
AAA002 | 2020-04-04 09:04:27.000 | 2020-04-04 09:08:34.000 | 2020-04-04 09:08:27.000 |