Home > Enterprise >  Returning NULLs for data which does not exist for certain dates [duplicate]
Returning NULLs for data which does not exist for certain dates [duplicate]

Time:09-17

I have a data table similar to the below:

EntityId Date Value
1 5/30/2021 42
1 6/30/2021 35
1 7/31/2021 59
1 8/31/2021 61
2 7/31/2021 98
2 8/31/2021 100
3 8/31/2021 34

I want to return all values in the "Value" column between calendar month end dates 5/31/2021 and 8/31/2021 for each unique entityId. However, each unique entityId does not always have a row for all such dates, in which I would like to return a null in the Value column should the date not exist. Using the query on the table should result in the following data:

EntityId Date Value
1 5/31/2021 42
1 6/30/2021 35
1 7/31/2021 59
1 8/31/2021 59
2 5/31/2021 NULL
2 6/30/2021 NULL
2 7/31/2021 98
2 8/31/2021 100
3 5/31/2021 NULL
3 6/30/2021 NULL
3 7/31/2021 NULL
3 8/31/2021 34

What would be the easiest way to accomplish this?

CodePudding user response:

You need to start with a list of dates and entities. Let me assume they are all in the table. So, the idea is to use cross join to generate the rows in the result set. Then use left join to bring in the remaining data:

select e.entityid, d.date, t.value
from (select distinct entityid from data) e cross join
     (select distinct date from data) d left join
     data t
     on t.entityid = e.entityid and t.date = d.date;

Note: This uses subqueries to generate the dates and entities. If this information is in other tables, you can directly use those tables.

CodePudding user response:

To create the full range of EntityId's and Date's the query uses the CROSS JOIN of distinct values as 'range_cte'. The original data 'data_cte' is then FULL OUTER JOIN'ed with 'range_cte'. Something like this

declare
  @start_dt           date='20210531',
  @end_dt             date='20210831';

;with
data_cte(EntityId, [Date], [Value]) as (
    select *
    from (values (1, cast('20210531' as date), 42),
                 (1, cast('20210630' as date), 35),
                 (1, cast('20210731' as date), 59),
                 (1, cast('20210831' as date), 59),
                 (2, cast('20210731' as date), 98),
                 (2, cast('20210831' as date), 100),
                 (3, cast('20210831' as date), 34)) 
                 v(EntityId, [Date], [Value])),
unq_id_cte(EntityId) as (
    select distinct EntityId
    from data_cte),
nums_cte(n) as (
    select * 
    from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(n)),
range_cte as (
    select top(datediff(month, @start_dt, @end_dt) 1) 
           eomonth(@start_dt, row_number() over (order by (select null))-1) mo_dt
    from nums_cte n1
         cross join nums_cte n2)
select ui.EntityId, coalesce(r.mo_dt, d.[Date]) [Date], d.[Value] 
from
  unq_id_cte ui
  cross join range_cte r
  full join data_cte d on ui.EntityId=d.EntityId
                          and r.mo_dt=d.[Date]
order by ui.EntityId, r.mo_dt;
EntityId    Date        Value
1           2021-05-31  42
1           2021-06-30  35
1           2021-07-31  59
1           2021-08-31  59
2           2021-05-31  NULL
2           2021-06-30  NULL
2           2021-07-31  98
2           2021-08-31  100
3           2021-05-31  NULL
3           2021-06-30  NULL
3           2021-07-31  NULL
3           2021-08-31  34
  • Related