Home > other >  Get lag value from the last row where value from other column = 'n'
Get lag value from the last row where value from other column = 'n'

Time:11-16

My data is as follows

enter image description here

when flag = 'y' I want col1 to be updated with the last flag = 'n' row. I want the col1 values in row num 5 & 6 to be updated with the last row where flag = n, i.e row 4 & similarly row 10,11 should be updated with row 9.

the data is sored by date in ascending order. I was able to use

I was able to use lag function where only row num 5 gets updated with row num 4 value, but row 6 takes row 5 value which is incorrect.

SELECT ID, col1, flag, date, 
CASE WHEN flag = 'y' THEN LAG(col1, 1) OVER (PARTITION BY ID ORDER BY date) 
     WHEN flag = 'n' THEN col1 END as col1_new
FROM my_table;

Could you please help me solve this so that - col1 takes the value of the last row where flag = 'n' dynamically.. ?

Thanks heaps!

CodePudding user response:

LAG does not suffice for this task. You need LAST VALUE with a condition and IGNORE NULLS for this:

SELECT 
  ID, col1, flag, date, 
  CASE WHEN flag = 'y' THEN 
    LAST_VALUE(CASE WHEN flag = 'n' THEN col1 END IGNORE NULLS)
      OVER (PARTITION BY id
            ORDER BY date
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
  ELSE
    col1
  END AS col1_new
FROM my_table
ORDER BY id, date;

This can even be simplified to

SELECT 
  ID, col1, flag, date, 
  LAST_VALUE(CASE WHEN flag = 'n' THEN col1 END IGNORE NULLS)
    OVER (PARTITION BY id ORDER BY date) AS col1_new
FROM my_table
ORDER BY id, date;

because for an 'n' row the value to show is the last 'n' row value, too, if we include this row in the window.

UPDATE

As SQL Server lacks the IGNORE NULLS clause (which renders LAST_VALUE useless here and probably also in almost every other scenario), I see two options to solve this with SQL in SQL Server.

Use a subquery

SELECT 
  ID, col1, flag, date, 
  CASE WHEN flag = 'y' THEN 
  (  
    SELECT TOP(1) col1
    FROM my_table t2
    WHERE t2.id = t.id AND t2.date <= t.date AND t2.flag = 'n'
    ORDER BY date DESC, col1 DESC
  )
  ELSE
    col1
  END AS col1_new
FROM my_table t
ORDER BY id, date, flag, col1;

Iterate through the table with a recursive query

WITH numbered AS
(
  SELECT
    t.*,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY date, flag, col1) as rn
  FROM my_table t
)
, cte (id, col1, flag, date, rn, last_n_col1) AS
(
  SELECT
    id, col1, flag, date, rn,
    CASE WHEN flag = 'n' THEN col1 ELSE NULL END
  FROM numbered
  WHERE rn = 1
  UNION ALL
  SELECT
    n.id, n.col1, n.flag, n.date, n.rn,
    CASE WHEN n.flag = 'n' THEN n.col1 ELSE cte.last_n_col1 END
  FROM cte
  JOIN numbered n ON n.id = cte.id and n.rn = cte.rn   1
)
SELECT *
FROM cte
ORDER BY id, rn;

CodePudding user response:

If the col1 column is strictly monotonically increasing for the n rows, we can simulate LAST_VALUE IGNORE NULLS using a windowed conditional MAX

SELECT 
  ID, col1, flag, date, 
  MAX(CASE WHEN flag = 'n' THEN col1 END)
    OVER (PARTITION BY id ORDER BY date ROWS UNBOUNDED PRECEDING) AS col1_new
FROM my_table
ORDER BY id, date;
  • Related