I have a data table similar to the below:
EntityId | Date | Value |
---|---|---|
1 | 5/30/2021 | 42 |
1 | 6/30/2021 | 35 |
1 | 7/31/2021 | 59 |
1 | 8/31/2021 | 61 |
2 | 7/31/2021 | 98 |
2 | 8/31/2021 | 100 |
3 | 8/31/2021 | 34 |
I want to return all values in the "Value" column between calendar month end dates 5/31/2021 and 8/31/2021 for each unique entityId. However, each unique entityId does not always have a row for all such dates, in which I would like to return a null in the Value column should the date not exist. Using the query on the table should result in the following data:
EntityId | Date | Value |
---|---|---|
1 | 5/31/2021 | 42 |
1 | 6/30/2021 | 35 |
1 | 7/31/2021 | 59 |
1 | 8/31/2021 | 59 |
2 | 5/31/2021 | NULL |
2 | 6/30/2021 | NULL |
2 | 7/31/2021 | 98 |
2 | 8/31/2021 | 100 |
3 | 5/31/2021 | NULL |
3 | 6/30/2021 | NULL |
3 | 7/31/2021 | NULL |
3 | 8/31/2021 | 34 |
What would be the easiest way to accomplish this?
CodePudding user response:
You need to start with a list of dates and entities. Let me assume they are all in the table. So, the idea is to use cross join
to generate the rows in the result set. Then use left join
to bring in the remaining data:
select e.entityid, d.date, t.value
from (select distinct entityid from data) e cross join
(select distinct date from data) d left join
data t
on t.entityid = e.entityid and t.date = d.date;
Note: This uses subqueries to generate the dates and entities. If this information is in other tables, you can directly use those tables.
CodePudding user response:
To create the full range of EntityId's and Date's the query uses the CROSS JOIN of distinct values as 'range_cte'. The original data 'data_cte' is then FULL OUTER JOIN'ed with 'range_cte'. Something like this
declare
@start_dt date='20210531',
@end_dt date='20210831';
;with
data_cte(EntityId, [Date], [Value]) as (
select *
from (values (1, cast('20210531' as date), 42),
(1, cast('20210630' as date), 35),
(1, cast('20210731' as date), 59),
(1, cast('20210831' as date), 59),
(2, cast('20210731' as date), 98),
(2, cast('20210831' as date), 100),
(3, cast('20210831' as date), 34))
v(EntityId, [Date], [Value])),
unq_id_cte(EntityId) as (
select distinct EntityId
from data_cte),
nums_cte(n) as (
select *
from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(n)),
range_cte as (
select top(datediff(month, @start_dt, @end_dt) 1)
eomonth(@start_dt, row_number() over (order by (select null))-1) mo_dt
from nums_cte n1
cross join nums_cte n2)
select ui.EntityId, coalesce(r.mo_dt, d.[Date]) [Date], d.[Value]
from
unq_id_cte ui
cross join range_cte r
full join data_cte d on ui.EntityId=d.EntityId
and r.mo_dt=d.[Date]
order by ui.EntityId, r.mo_dt;
EntityId Date Value
1 2021-05-31 42
1 2021-06-30 35
1 2021-07-31 59
1 2021-08-31 59
2 2021-05-31 NULL
2 2021-06-30 NULL
2 2021-07-31 98
2 2021-08-31 100
3 2021-05-31 NULL
3 2021-06-30 NULL
3 2021-07-31 NULL
3 2021-08-31 34