Let's say I have the following data set:
ID | Identifier | Admission_Date | Release_Date |
---|---|---|---|
234 | 2 | 5/1/22 | 5/5/22 |
234 | 1 | 4/25/22 | 4/30/22 |
234 | 2 | 4/20/22 | 4/24/22 |
234 | 2 | 4/15/22 | 4/18/22 |
789 | 1 | 7/15/22 | 7/19/22 |
789 | 2 | 7/8/22 | 7/14/22 |
789 | 2 | 7/1/22 | 7/5/22 |
321 | 2 | 6/1/21 | 6/3/21 |
321 | 2 | 5/27/21 | 5/31/21 |
321 | 1 | 5/20/21 | 5/26/21 |
321 | 2 | 5/15/21 | 5/19/21 |
321 | 2 | 5/6/21 | 5/10/21 |
I want all rows with identifier=1
. I also want rows that are either directly below or above rows with Identifier=1
- sorted by most recent to least recent.
There is always a row below rows with identifier=1
. There may or may not be a row above. If there is no row with identifier=1
for an ID
, then it will not be brought in with a prior step.
The resulting data set should be as follows:
ID | Identifier | Admission Date | Release Date |
---|---|---|---|
234 | 2 | 5/1/22 | 5/5/22 |
234 | 1 | 4/25/22 | 4/30/22 |
234 | 2 | 4/20/22 | 4/24/22 |
789 | 1 | 7/15/22 | 7/19/22 |
789 | 2 | 7/8/22 | 7/14/22 |
321 | 2 | 5/27/21 | 5/31/21 |
321 | 1 | 5/20/21 | 5/26/21 |
321 | 2 | 5/15/21 | 5/19/21 |
I am using DBeaver, which runs PostgreSQL.
CodePudding user response:
I admittedly don't know Postgres well so the following could possibly be optimised, however using a combination of lag and lead to obtain the previous and next dates (assuming Admission_date is the one to order by) you could try
with d as (
select *,
case when identifier = 1 then Lag(admission_date) over(partition by id order by Admission_Date desc) end pd,
case when identifier = 1 then Lead(admission_date) over(partition by id order by Admission_Date desc) end nd
from t
)
select id, Identifier, Admission_Date, Release_Date
from d
where identifier = 1
or exists (
select * from d d2
where d2.id = d.id
and (d.Admission_Date = pd or d.admission_date = nd)
)
order by Id, Admission_Date desc;
CodePudding user response:
One way:
SELECT (x.my_row).* -- decompose fields from row type
FROM (
SELECT identifier
, lag(t) OVER w AS t0 -- take whole row
, t AS t1
, lead(t) OVER w AS t2
FROM tbl t
WINDOW w AS (PARTITION BY id ORDER BY admission_date)
) sub
CROSS JOIN LATERAL (
VALUES (t0), (t1), (t2) -- pivot
) x(my_row)
WHERE sub.identifier = 1
AND (x.my_row).id IS NOT NULL; -- exclude rows with NULL ( = missing row)
db<>fiddle here
The query is designed to only make a single pass over the table. Uses some advanced SQL / Postgres features.
About LATERAL
:
About the VALUES
expression:
The manual about extracting fields from a composite type.
If there are many rows per id
, other solutions will be (much) faster - with proper index support. You did not specify ...