Home > OS >  SQL Query to Identify Missing Records (NOT IN subquery)
SQL Query to Identify Missing Records (NOT IN subquery)

Time:05-27

I'm trying to write a query which will return every ID (shared value between both document types 6553 and 6554) for print jobs which generated for document type 6553, but not 6554 - these are meant to always generate together.

I've attempted the following, which takes quite a long time to run and produces a number of values which seems too high to me:

select ID from PrintQueueShadow
where DocumentType = '6553'
and CreateDate > getdate() - 7  --No more than 7 days old
and ID not in
(select ID from PrintQueueShadow
where CreateDate > getdate() - 7
and DocumentType = '6554')  --Checking against every ID for Doc Type 6554s sent in the last 7 days

Any help would be much appreciated. Thanks!

CodePudding user response:

Your logic looks correct, but we can improve upon what you have by using exists logic instead. Also, we can try suggesting an index which might speed up the query.

SELECT ID
FROM PrintQueueShadow pqs1
WHERE DocumentType = '6553' AND CreateDate > GETDATE() - 7 AND
      NOT EXISTS (
          SELECT 1
          FROM PrintQueueShadow pqs2
          WHERE pqs2.ID = pqs1.ID AND
                CreateDate > GETDATE() - 7 AND 
                DocumentType = '6554'
      );

Using EXISTS might perform better than WHERE IN (...), as the former lets the database stop searching as soon as it finds a matching record in the subquery.

The above query might benefit from the following index:

CREATE INDEX idx ON PrintQueueShadow (ID, DocumentType, CreateDate);

You may also try permutations of the three columns in the above index.

CodePudding user response:

You can try using self join.

SELECT DISTINCT PQS_1.ID 
FROM PrintQueueShadow AS PQS_1
LEFT JOIN PrintQueueShadow AS PQS_2
     ON  PQS_1.ID = PQS_2.ID
     AND PQS_2.CreateDate > GETDATE() - 7
     AND PQS_2.DocumentType = '6554' --Checking against every ID for Doc Type 6554s sent in the last 7 days
WHERE PQS_1.DocumentType = '6553'
  AND PQS_1.CreateDate > GETDATE() - 7  --No more than 7 days old
  AND PQS_2.ID IS NULL -- Excluding ID which have been found in Doc Type 6553

I've also added DISTINCT statement in case if there're duplicates of ID in a given DocumentType

  • Related