Home > Blockchain >  Filter rows on dates depending on different column values
Filter rows on dates depending on different column values

Time:05-25

I've been trying to filter rows with a certain value for code based on the values of task_dts. Essentially I only want the rows per id where the task_dts timestamp falls between the code_dts timestamp of the current code and the code_dts of the following code.

For example; for the rows where code equals 'z', I only want the rows where the task_dts is in the range of the code_dts for value 'z' and value 'y'. For the rows where the code equals 'y', I only want the rows where the task_dts is in the range of the code_dts for value 'y' and value 'x' etc

My table looks the following:

rowid id code code_dts task task_dts
1 a z 2022-02-01 10:17:08.403000 1 2022-02-01 10:21:27.000000
2 a z 2022-02-01 10:17:08.403000 2 2022-02-01 10:21:31.000000
3 a z 2022-02-01 10:17:08.403000 3 2022-02-01 12:41:43.000000
4 a y 2022-02-01 11:12:13.270000 1 2022-02-01 10:21:27.000000
5 a y 2022-02-01 11:12:13.270000 3 2022-02-01 12:41:43.000000
6 a y 2022-02-01 11:12:13.270000 8 2022-02-21 14:57:53.000000
7 a x 2022-02-21 12:28:50.647000 6 2022-02-21 14:57:53.000000
8 a x 2022-02-21 12:28:50.647000 7 2022-02-21 14:57:54.000000
9 b h 2022-04-05 13:44:16.030000 1 2022-04-05 14:03:56.570000
10 b h 2022-04-05 13:44:16.030000 2 2022-04-05 14:03:56.570000
11 b i 2022-04-06 13:44:16.030000 1 2022-04-05 14:03:56.570000
12 b j 2022-04-07 13:44:16.030000 3 2022-04-05 14:03:56.570000

The output would look like this:

rowid id code code_dts task task_dts
1 a z 2022-02-01 10:17:08.403000 1 2022-02-01 10:21:27.000000
2 a z 2022-02-01 10:17:08.403000 2 2022-02-01 10:21:31.000000
5 a y 2022-02-01 11:12:13.270000 3 2022-02-01 12:41:43.000000
7 a x 2022-02-21 12:28:50.647000 6 2022-02-21 14:57:53.000000
8 a x 2022-02-21 12:28:50.647000 7 2022-02-21 14:57:54.000000
10 b h 2022-04-05 13:44:16.030000 2 2022-04-05 14:03:56.570000
11 b i 2022-04-06 13:44:16.030000 1 2022-04-05 14:03:56.570000
12 b j 2022-04-07 13:44:16.030000 3 2022-04-05 14:03:56.570000

I've tried to solve this with qualify without succes. help would be much appreciated

CodePudding user response:

You can use a table expression to pre-compute the timestamp ranges. Then, filtering is easy.

For example:

select t.*
from t
join (
  select code, dt, lead(dt) over(order by dt) as next_dt
  from (select code, min(code_dts) as dt from t group by code) x
) y on t.code = y.code
where t.task_dts between y.dt and y.next_dt or y.next_dt is null

CodePudding user response:

After reading The Impaler's answer I finally understood your requirements :-)

This is the same logic based on Window Functions:

with cte as
 ( 
   select t.*
     -- next code_dts, i.e. at least one row will return 
     -- the code_dts of the following code 
     ,lead(code_dts,1,task_dts) over (order by code_dts) as next_dts
   from tab as t
 )
select *
from cte
qualify task_dts between code_dts
                     -- assign the next code's dts to all rows within the same code
                     and max(next_dts) over (partition by code)
; 

Hard to tell which one will perfrom better...

  • Related