My data is as follows
when flag = 'y' I want col1 to be updated with the last flag = 'n' row. I want the col1 values in row num 5 & 6 to be updated with the last row where flag = n, i.e row 4 & similarly row 10,11 should be updated with row 9.
the data is sored by date in ascending order. I was able to use
I was able to use lag function where only row num 5 gets updated with row num 4 value, but row 6 takes row 5 value which is incorrect.
SELECT ID, col1, flag, date,
CASE WHEN flag = 'y' THEN LAG(col1, 1) OVER (PARTITION BY ID ORDER BY date)
WHEN flag = 'n' THEN col1 END as col1_new
FROM my_table;
Could you please help me solve this so that - col1 takes the value of the last row where flag = 'n' dynamically.. ?
Thanks heaps!
CodePudding user response:
LAG
does not suffice for this task. You need LAST VALUE
with a condition and IGNORE NULLS
for this:
SELECT
ID, col1, flag, date,
CASE WHEN flag = 'y' THEN
LAST_VALUE(CASE WHEN flag = 'n' THEN col1 END IGNORE NULLS)
OVER (PARTITION BY id
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
ELSE
col1
END AS col1_new
FROM my_table
ORDER BY id, date;
This can even be simplified to
SELECT
ID, col1, flag, date,
LAST_VALUE(CASE WHEN flag = 'n' THEN col1 END IGNORE NULLS)
OVER (PARTITION BY id ORDER BY date) AS col1_new
FROM my_table
ORDER BY id, date;
because for an 'n' row the value to show is the last 'n' row value, too, if we include this row in the window.
UPDATE
As SQL Server lacks the IGNORE NULLS
clause (which renders LAST_VALUE
useless here and probably also in almost every other scenario), I see two options to solve this with SQL in SQL Server.
Use a subquery
SELECT
ID, col1, flag, date,
CASE WHEN flag = 'y' THEN
(
SELECT TOP(1) col1
FROM my_table t2
WHERE t2.id = t.id AND t2.date <= t.date AND t2.flag = 'n'
ORDER BY date DESC, col1 DESC
)
ELSE
col1
END AS col1_new
FROM my_table t
ORDER BY id, date, flag, col1;
Iterate through the table with a recursive query
WITH numbered AS
(
SELECT
t.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date, flag, col1) as rn
FROM my_table t
)
, cte (id, col1, flag, date, rn, last_n_col1) AS
(
SELECT
id, col1, flag, date, rn,
CASE WHEN flag = 'n' THEN col1 ELSE NULL END
FROM numbered
WHERE rn = 1
UNION ALL
SELECT
n.id, n.col1, n.flag, n.date, n.rn,
CASE WHEN n.flag = 'n' THEN n.col1 ELSE cte.last_n_col1 END
FROM cte
JOIN numbered n ON n.id = cte.id and n.rn = cte.rn 1
)
SELECT *
FROM cte
ORDER BY id, rn;
CodePudding user response:
If the col1
column is strictly monotonically increasing for the n
rows, we can simulate LAST_VALUE
IGNORE NULLS
using a windowed conditional MAX
SELECT
ID, col1, flag, date,
MAX(CASE WHEN flag = 'n' THEN col1 END)
OVER (PARTITION BY id ORDER BY date ROWS UNBOUNDED PRECEDING) AS col1_new
FROM my_table
ORDER BY id, date;