I've been trying to filter rows with a certain value for code
based on the values of task_dts.
Essentially I only want the rows per id where the task_dts
timestamp falls between the code_dts
timestamp of the current code
and the code_dts
of the following code
.
For example; for the rows where code
equals 'z', I only want the rows where the task_dts
is in the range of the code_dts
for value 'z' and value 'y'.
For the rows where the code
equals 'y', I only want the rows where the task_dts
is in the range of the code_dts
for value 'y' and value 'x' etc
My table looks the following:
rowid | id | code | code_dts | task | task_dts |
---|---|---|---|---|---|
1 | a | z | 2022-02-01 10:17:08.403000 | 1 | 2022-02-01 10:21:27.000000 |
2 | a | z | 2022-02-01 10:17:08.403000 | 2 | 2022-02-01 10:21:31.000000 |
3 | a | z | 2022-02-01 10:17:08.403000 | 3 | 2022-02-01 12:41:43.000000 |
4 | a | y | 2022-02-01 11:12:13.270000 | 1 | 2022-02-01 10:21:27.000000 |
5 | a | y | 2022-02-01 11:12:13.270000 | 3 | 2022-02-01 12:41:43.000000 |
6 | a | y | 2022-02-01 11:12:13.270000 | 8 | 2022-02-21 14:57:53.000000 |
7 | a | x | 2022-02-21 12:28:50.647000 | 6 | 2022-02-21 14:57:53.000000 |
8 | a | x | 2022-02-21 12:28:50.647000 | 7 | 2022-02-21 14:57:54.000000 |
9 | b | h | 2022-04-05 13:44:16.030000 | 1 | 2022-04-05 14:03:56.570000 |
10 | b | h | 2022-04-05 13:44:16.030000 | 2 | 2022-04-05 14:03:56.570000 |
11 | b | i | 2022-04-06 13:44:16.030000 | 1 | 2022-04-05 14:03:56.570000 |
12 | b | j | 2022-04-07 13:44:16.030000 | 3 | 2022-04-05 14:03:56.570000 |
The output would look like this:
rowid | id | code | code_dts | task | task_dts |
---|---|---|---|---|---|
1 | a | z | 2022-02-01 10:17:08.403000 | 1 | 2022-02-01 10:21:27.000000 |
2 | a | z | 2022-02-01 10:17:08.403000 | 2 | 2022-02-01 10:21:31.000000 |
5 | a | y | 2022-02-01 11:12:13.270000 | 3 | 2022-02-01 12:41:43.000000 |
7 | a | x | 2022-02-21 12:28:50.647000 | 6 | 2022-02-21 14:57:53.000000 |
8 | a | x | 2022-02-21 12:28:50.647000 | 7 | 2022-02-21 14:57:54.000000 |
10 | b | h | 2022-04-05 13:44:16.030000 | 2 | 2022-04-05 14:03:56.570000 |
11 | b | i | 2022-04-06 13:44:16.030000 | 1 | 2022-04-05 14:03:56.570000 |
12 | b | j | 2022-04-07 13:44:16.030000 | 3 | 2022-04-05 14:03:56.570000 |
I've tried to solve this with qualify without succes. help would be much appreciated
CodePudding user response:
You can use a table expression to pre-compute the timestamp ranges. Then, filtering is easy.
For example:
select t.*
from t
join (
select code, dt, lead(dt) over(order by dt) as next_dt
from (select code, min(code_dts) as dt from t group by code) x
) y on t.code = y.code
where t.task_dts between y.dt and y.next_dt or y.next_dt is null
CodePudding user response:
After reading The Impaler's answer I finally understood your requirements :-)
This is the same logic based on Window Functions:
with cte as
(
select t.*
-- next code_dts, i.e. at least one row will return
-- the code_dts of the following code
,lead(code_dts,1,task_dts) over (order by code_dts) as next_dts
from tab as t
)
select *
from cte
qualify task_dts between code_dts
-- assign the next code's dts to all rows within the same code
and max(next_dts) over (partition by code)
;
Hard to tell which one will perfrom better...