This is using SQL Server 2019
Difficult to explain; I have two tables:
DATES:
DATE |
---|
01/01/2022 |
02/01/2022 |
INSURANCE_COVER:
INSURANCE_COVER_ID | INSURANCE_DATE |
---|---|
1 | 02/01/2022 |
1 | 03/01/2022 |
2 | 31/12/2021 |
2 | 01/01/2022 |
I need to get all rows from INSURANCE_COVER
where the INSURANCE_COVER_ID
does not have a INSURANCE_DATE
in DATES
and I need the missing DATE
to show.
I need the returned results to look like this:
DATE | INSURANCE_ID |
---|---|
01/01/2022 | 1 |
02/01/2022 | 2 |
I have only been able to do this with a while loop, but the performance is awful.
CodePudding user response:
We can build the list of desired values with a cross join, and then use an exclusion join to find what is missing from that set:
SELECT d.[Date], IDs.INSURANCE_COVER_ID
FROM [dates] d
CROSS JOIN (SELECT DISTINCT INSURANCE_COVER_ID FROM INSURANCE_COVER) IDs
LEFT JOIN INSURANCE_COVER C ON C.INSURANCE_COVER_ID = IDs.INSURANCE_COVER_ID
AND C.Insurance_Date = d.[Date]
WHERE C.INSURANCE_COVER_ID IS NULL
See it here:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d26d09494a0d6b06623a35610170c8eb
We could probably make this even faster with a NOT EXISTS()
instead of an exclusion join, but my personal history is I find it much easier and faster to write the exclusion join, and the performance difference doesn't always justify the additional initial time investment.
As a side note, the date formats in that data are WRONG. Different languages and cultures expect dates formatted in different ways. Some like MM/dd/yyyy. Some like dd-MM-yyyy. Others might prefer yyyy-MM-dd.
The SQL language is no different: it has it's own expectations around how dates should look, and when writing for SQL you should use that format. When writing a date value with no time, it looks like this: yyyyMMdd
(note the complete lack of separators in addition to the ordering). When including a time value, it looks like this: yyyy-MM-ddTHH:mm:ss[.fff]
.