I need to fill missing date values in orders table. DDL:
create table orders(order_date date, order_value int)
insert into orders values('2022-11-01',100),('2022-11-04 ',200),('2022-11-08',300)
Expected output is as:
order_date | order_value
-----------------------
2022-11-01 | 100
2022-11-02 | 100
2022-11-03 | 100
2022-11-04 | 200
2022-11-05 | 200
2022-11-06 | 200
2022-11-07 | 200
2022-11-08 | 300
I have solved the problem in ms sql using recursive query listed below.
with cte as (
select min(order_date) [min_date], MAX(order_date) [max_date]
FROM orders
), cte2 AS(
SELECT min_date [date]
FROM cte
UNION ALL
SELECT dateadd(day,1,date) [date]
FROM cte2
WHERE date < (SELECT max_date FROM cte)
), cte3 as(
select date [order_date], order_value
FROM cte2
LEFT JOIN orders on date = order_date
)
SELECT order_date,
FIRST_VALUE(order_value) IGNORE NULLS
OVER(ORDER BY order_date desc ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) [order_value]
FROM cte3
Is there any alternate approach to solve this problem or any way to optimize the recursive query?
Thanks in advance.
CodePudding user response:
As mentioned in the comments, as you are on SQL Server 2022 you can make use of GENERATE_SERIES
here, which will be much more performant that a recursive Common Table Expression (rCTE). On prior versions, you can use an (inline) tally or a Calendar table.
WITH Dates AS(
SELECT MIN(order_date) AS MinDate,
MAX(order_date) AS MaxDate
FROM dbo.orders),
DateSeries AS(
SELECT DATEADD(DAY,GS.value,D.MinDate) AS Date
FROM Dates D
CROSS APPLY GENERATE_SERIES(0,DATEDIFF(DAY,MinDate,MaxDate),1) GS)
SELECT DS.Date,
FIRST_VALUE(O.order_value) IGNORE NULLS OVER (ORDER BY DS.Date desc ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS [order_value]
FROM DateSeries DS
LEFT JOIN dbo.orders O ON DS.Date = O.order_date
ORDER BY DS.Date;