Home > Software engineering >  Get certain rows, plus rows before and after
Get certain rows, plus rows before and after

Time:08-24

Let's say I have the following data set:

ID Identifier Admission_Date Release_Date
234 2 5/1/22 5/5/22
234 1 4/25/22 4/30/22
234 2 4/20/22 4/24/22
234 2 4/15/22 4/18/22
789 1 7/15/22 7/19/22
789 2 7/8/22 7/14/22
789 2 7/1/22 7/5/22
321 2 6/1/21 6/3/21
321 2 5/27/21 5/31/21
321 1 5/20/21 5/26/21
321 2 5/15/21 5/19/21
321 2 5/6/21 5/10/21

I want all rows with identifier=1. I also want rows that are either directly below or above rows with Identifier=1 - sorted by most recent to least recent.

There is always a row below rows with identifier=1. There may or may not be a row above. If there is no row with identifier=1 for an ID, then it will not be brought in with a prior step.

The resulting data set should be as follows:

ID Identifier Admission Date Release Date
234 2 5/1/22 5/5/22
234 1 4/25/22 4/30/22
234 2 4/20/22 4/24/22
789 1 7/15/22 7/19/22
789 2 7/8/22 7/14/22
321 2 5/27/21 5/31/21
321 1 5/20/21 5/26/21
321 2 5/15/21 5/19/21

I am using DBeaver, which runs PostgreSQL.

CodePudding user response:

I admittedly don't know Postgres well so the following could possibly be optimised, however using a combination of lag and lead to obtain the previous and next dates (assuming Admission_date is the one to order by) you could try

with d as (
  select *, 
    case when identifier = 1 then Lag(admission_date) over(partition by id order by Admission_Date desc) end pd, 
    case when identifier = 1 then Lead(admission_date) over(partition by id order by Admission_Date desc) end nd
  from t
)
select id, Identifier, Admission_Date, Release_Date
from d
where identifier = 1
or exists (
  select * from d d2 
  where d2.id = d.id
    and (d.Admission_Date = pd or d.admission_date = nd)
)
order by Id, Admission_Date desc;

CodePudding user response:

One way:

SELECT (x.my_row).*  -- decompose fields from row type
FROM  (
   SELECT identifier
        , lag(t)  OVER w AS t0  -- take whole row
        , t              AS t1
        , lead(t) OVER w AS t2
   FROM   tbl t
   WINDOW w AS (PARTITION BY id ORDER BY admission_date)
   ) sub
CROSS JOIN LATERAL (
   VALUES (t0), (t1), (t2)  -- pivot
    ) x(my_row)
WHERE  sub.identifier = 1
AND    (x.my_row).id IS NOT NULL;  -- exclude rows with NULL ( = missing row)

db<>fiddle here

The query is designed to only make a single pass over the table. Uses some advanced SQL / Postgres features.

About LATERAL:

About the VALUES expression:

The manual about extracting fields from a composite type.

If there are many rows per id, other solutions will be (much) faster - with proper index support. You did not specify ...

  • Related