Home > Back-end >  SQL query to populate missing values using lead and lag
SQL query to populate missing values using lead and lag

Time:07-08

I'm trying to create a new column that fills in the nulls below. I tried using leads and lags but isn't turning out right. Basically trying to figure out who is in "possession" of the record, given the TransferFrom and TransferTo columns and sequence of events. For instance, 35 was in possession of the record until they transferred it to 57. My current query will only populate 35 in record 2 since it only leads by one record. I need the query to populate all records prior to this as well if 35 is the first value found in the TransferFrom column. SQL Server...Any ideas?

Create table script:

CREATE TABLE results (
 OrderID int
,TransferFrom varchar(20)
,TransferTo varchar(20)
,ActionTime datetime)

INSERT INTO results

VALUES
(1,null,null,'2020-01-01 00:00:00'),
(1,null,null,'2020-01-02 00:00:00'),
(1,null,null,'2020-01-03 00:00:00'),
(1,'35','57','2020-01-04 00:00:00'),
(1,null,null,'2020-01-05 00:00:00'),
(1,null,null,'2020-01-06 00:00:00'),
(1,'57','45','2020-01-07 00:00:00'),
(1,null,null,'2020-01-08 00:00:00'),
(1,null,null,'2020-01-09 00:00:00'),
(1,null,null,'2020-01-10 00:00:00')

Current query that doesn't work:

SELECT *
     ,coalesce(
        lead(TransferFrom) over (partition by OrderID order by ActionTime)
       ,TransferFrom
       ,lag(TransferTo) over (partition by OrderID order by ActionTime)) as NewColumn
FROM results

Current query result that is incorrect:

enter image description here

Desired query result:

enter image description here

CodePudding user response:

Kind of a funky situation but this works for your sample data. Ideally it would be better to fix the process that is not inserting values consistently so you don't have to jump through these hoops.

select r.*
    , NewColumn = coalesce(x.TransferFrom, y.TransferTo)
from results r
outer apply
(
    select top 1 TransferFrom
    from results r2 
    where r2.ActionTime >= r.ActionTime
        and r2.TransferFrom is not null
    order by r2.ActionTime
) x --this will get the values for all the rows that have a preceeding NULL
outer apply
(
    select top 1 TransferTo
    from results r3 
    where r3.ActionTime <= r.ActionTime
        and r3.TransferTo is not null
    order by r3.ActionTime desc
) y --this will get the values for the last rows that don't have a value.
order by r.ActionTime
  • Related