I'm writing data quality rules, and in this case- am looking to return duplicate values for the column called "ID". The problem is that, when there are no duplicates, null is returned instead of 0. I need 0 to be returned for the non-dupes. I've tried putting the ISNULL near the SELECT ID and SELECT COUNT(ID) statements, but have had no success. I simply can't figure out how where to put the ISNULL statement(if that's accurate to even use), or if COALESCE should be used, or if I need an outer join(according to a similar question). Any help is greatly appreciated. Thanks!
SELECT
CAST('DQR_ID_HERE') AS VARCHAR(255)) AS DQR_ID
CAST('DB_NAME_HERE') AS VARCHAR(255)) AS DB_NAME
CAST('ID') AS VARCHAR(255)) AS COLUMN_TO_DUPE_CHECK
SUM (DEFECT.DEFECT_COUNT) AS DEFECT_COUNT
FROM (
SELECT
ID,
COUNT(ID) AS DEFECT_COUNT
FROM DB_NAME_HERE
GROUP BY ID
HAVING (COUNT(ID)) > 1
) AS DEFECT
CodePudding user response:
You just need to place ISNULL
around the SUM
ISNULL(SUM (DEFECT.DEFECT_COUNT), 0)
Do not place ISNULL
inside the SUM
, because then you will still get NULL
if there are no rows to sum.