Get certain rows, plus rows before and after-CodePudding

Let's say I have the following data set:

ID	Identifier	Admission_Date	Release_Date
234	2	5/1/22	5/5/22
234	1	4/25/22	4/30/22
234	2	4/20/22	4/24/22
234	2	4/15/22	4/18/22
789	1	7/15/22	7/19/22
789	2	7/8/22	7/14/22
789	2	7/1/22	7/5/22
321	2	6/1/21	6/3/21
321	2	5/27/21	5/31/21
321	1	5/20/21	5/26/21
321	2	5/15/21	5/19/21
321	2	5/6/21	5/10/21

I want all rows with identifier=1. I also want rows that are either directly below or above rows with Identifier=1 - sorted by most recent to least recent.

There is always a row below rows with identifier=1. There may or may not be a row above. If there is no row with identifier=1 for an ID, then it will not be brought in with a prior step.

The resulting data set should be as follows:

ID	Identifier	Admission Date	Release Date
234	2	5/1/22	5/5/22
234	1	4/25/22	4/30/22
234	2	4/20/22	4/24/22
789	1	7/15/22	7/19/22
789	2	7/8/22	7/14/22
321	2	5/27/21	5/31/21
321	1	5/20/21	5/26/21
321	2	5/15/21	5/19/21

I am using DBeaver, which runs PostgreSQL.

CodePudding user response：

I admittedly don't know Postgres well so the following could possibly be optimised, however using a combination of lag and lead to obtain the previous and next dates (assuming Admission_date is the one to order by) you could try

with d as (
  select *, 
    case when identifier = 1 then Lag(admission_date) over(partition by id order by Admission_Date desc) end pd, 
    case when identifier = 1 then Lead(admission_date) over(partition by id order by Admission_Date desc) end nd
  from t
)
select id, Identifier, Admission_Date, Release_Date
from d
where identifier = 1
or exists (
  select * from d d2 
  where d2.id = d.id
    and (d.Admission_Date = pd or d.admission_date = nd)
)
order by Id, Admission_Date desc;

CodePudding user response：

One way:

SELECT (x.my_row).*  -- decompose fields from row type
FROM  (
   SELECT identifier
        , lag(t)  OVER w AS t0  -- take whole row
        , t              AS t1
        , lead(t) OVER w AS t2
   FROM   tbl t
   WINDOW w AS (PARTITION BY id ORDER BY admission_date)
   ) sub
CROSS JOIN LATERAL (
   VALUES (t0), (t1), (t2)  -- pivot
    ) x(my_row)
WHERE  sub.identifier = 1
AND    (x.my_row).id IS NOT NULL;  -- exclude rows with NULL ( = missing row)

db<>fiddle here

The query is designed to only make a single pass over the table. Uses some advanced SQL / Postgres features.

About LATERAL:

What is the difference between a LATERAL JOIN and a subquery in PostgreSQL?

About the VALUES expression:

Postgres: convert single row to multiple rows (unpivot)

The manual about extracting fields from a composite type.

If there are many rows per id, other solutions will be (much) faster - with proper index support. You did not specify ...