Here is an example:
Id|price|Date
1|2|2022-05-21
1|3|2022-06-15
1|2.5|2022-06-19
Needs to look like this:
Id|Date|price
1|2022-05-21|2
1|2022-05-22|2
1|2022-05-23|2
...
1|2022-06-15|3
1|2022-06-16|3
1|2022-06-17|3
1|2022-06-18|3
1|2022-06-19|2.5
1|2022-06-20|2.5
...
Until today
1|2022-08-30|2.5
I tried using the lag(price) over (partition by id order by date) But i can't get it right.
CodePudding user response:
I'm not familiar with Azure, but it looks like you need to use a calendar table, or generate missing dates using a recursive CTE.
To get started with a recursive CTE, you can generate line numbers for each id
(assuming multiple id values) in the source data ordered by date
. These rows with row number equal to 1
(with the minimum date
value for the corresponding id
) will be used as the starting point for the recursion. Then you can use the DATEADD
function to generate the row for the next day. To use the price
values from the original data, you can use a subquery to get the price
for this new date, and if there is no such value (no row for this date
), use the previous price
value from CTE (use the COALESCE
function for this).
For SQL Server query can look like this
WITH cte AS (
SELECT
id,
date,
price
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS rn
FROM tbl
) t
WHERE rn = 1
UNION ALL
SELECT
cte.id,
DATEADD(d, 1, cte.date),
COALESCE(
(SELECT tbl.price
FROM tbl
WHERE tbl.id = cte.id AND tbl.date = DATEADD(d, 1, cte.date)),
cte.price
)
FROM cte
WHERE DATEADD(d, 1, cte.date) <= GETDATE()
)
SELECT * FROM cte
ORDER BY id, date
OPTION (MAXRECURSION 0)
Note that I added OPTION (MAXRECURSION 0)
to make the recursion run through all the steps, since the default value is 100, this is not enough to complete the recursion.
db<>fiddle here
The same approach for MySQL (you need MySQL of version 8.0 to use CTE)
WITH RECURSIVE cte AS (
SELECT
id,
date,
price
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS rn
FROM tbl
) t
WHERE rn = 1
UNION ALL
SELECT
cte.id,
DATE_ADD(cte.date, interval 1 day),
COALESCE(
(SELECT tbl.price
FROM tbl
WHERE tbl.id = cte.id AND tbl.date = DATE_ADD(cte.date, interval 1 day)),
cte.price
)
FROM cte
WHERE DATE_ADD(cte.date, interval 1 day) <= NOW()
)
SELECT * FROM cte
ORDER BY id, date
db<>fiddle here
Both queries produces the same results, the only difference is the use of the engine's specific date functions.
For MySQL versions below 8.0, you can use a calendar table since you don't have CTE support and can't generate the required date range.
Assuming there is a column in the calendar table to store date values (let's call it date
for simplicity) you can use the CROSS JOIN
operator to generate date ranges for the id
values in your table that will match existing dates. Then you can use a subquery to get the latest price
value from the table which is stored for the corresponding date
or before it.
So the query would be like this
SELECT
d.id,
d.date,
(SELECT
price
FROM tbl
WHERE tbl.id = d.id AND tbl.date <= d.date
ORDER BY tbl.date DESC
LIMIT 1
) price
FROM (
SELECT
t.id,
c.date
FROM calendar c
CROSS JOIN (SELECT DISTINCT id FROM tbl) t
WHERE c.date BETWEEN (
SELECT
MIN(date) min_date
FROM tbl
WHERE tbl.id = t.id
)
AND NOW()
) d
ORDER BY id, date
Using my pseudo-calendar table with date
values ranging from 2022-05-20
to 2022-05-30
and source data in that range, like so
id | price | date |
---|---|---|
1 | 2 | 2022-05-21 |
1 | 3 | 2022-05-25 |
1 | 2.5 | 2022-05-28 |
2 | 10 | 2022-05-25 |
2 | 100 | 2022-05-30 |
the query produces following results
id | date | price |
---|---|---|
1 | 2022-05-21 | 2 |
1 | 2022-05-22 | 2 |
1 | 2022-05-23 | 2 |
1 | 2022-05-24 | 2 |
1 | 2022-05-25 | 3 |
1 | 2022-05-26 | 3 |
1 | 2022-05-27 | 3 |
1 | 2022-05-28 | 2.5 |
1 | 2022-05-29 | 2.5 |
1 | 2022-05-30 | 2.5 |
2 | 2022-05-25 | 10 |
2 | 2022-05-26 | 10 |
2 | 2022-05-27 | 10 |
2 | 2022-05-28 | 10 |
2 | 2022-05-29 | 10 |
2 | 2022-05-30 | 100 |
db<>fiddle here