Home > front end >  Counting all rows of table once, after a series of Union Alls
Counting all rows of table once, after a series of Union Alls

Time:10-13

Creating data quality rules- and in this case, I'm doing a null check on a few columns of a table. What I'm trying to figure out, is how to get the total record\row count for the table to be queried once(to minimize stress on the server), but populate in all rows of the result set in the column TOTAL_COUNT. I believe that I need a left join- just trying to learn how & where to use it.

So the code that I have, so far is:

SELECT
 CAST('RULE_ID_1') AS VARCHAR(255)) AS RULE_ID
 CAST('DB_NAME_HERE') AS VARCHAR(255)) AS DB_NAME
 CAST('1STCOLUMN') AS VARCHAR(255)) AS COLUMN_NAME
 A.DEFECT_COUNT AS DEFECT_COUNT
FROM
(SELECT COUNT(1) AS DEFECT_COUNT
FROM [Name of Db]
WHERE [Name of column 1] IS NULL
AND DateField = DATEADD(DAY, -1, CAST(GETDATE() AS DATE))) A
UNION ALL

SELECT
 CAST('RULE_ID_2') AS VARCHAR(255)) AS RULE_ID
 CAST('DB_NAME_HERE') AS VARCHAR(255)) AS DB_NAME
 CAST('2NDCOLUMN') AS VARCHAR(255)) AS COLUMN_NAME
 A.DEFECT_COUNT AS DEFECT_COUNT
FROM
(SELECT COUNT(1) AS DEFECT_COUNT
FROM [Name of Db]
WHERE [Name of column 2] IS NULL
AND DateField = DATEADD(DAY, -1, CAST(GETDATE() AS DATE))) A
UNION ALL

SELECT
 CAST('RULE_ID_3') AS VARCHAR(255)) AS RULE_ID
 CAST('DB_NAME_HERE') AS VARCHAR(255)) AS DB_NAME
 CAST('3RDCOLUMN') AS VARCHAR(255)) AS COLUMN_NAME
 A.DEFECT_COUNT AS DEFECT_COUNT
FROM
(SELECT COUNT(1) AS DEFECT_COUNT
FROM [Name of Db]
WHERE [Name of column 3] IS NULL
AND DateField = DATEADD(DAY, -1, CAST(GETDATE() AS DATE))) A

LEFT JOIN
(SELECT 'DB_NAME_HERE' AS DB_NAME, COUNT(*) AS TOTAL_COUNT FROM [Name of Db] WHERE [What goes here?] IS NULL) B
ON
A.DB_NAME=B.DB_NAME

The output| that i need to report out:

RULE_ID DB_NAME COLUMN_NAME DEFECT_COUNT(Nulls) TOTAL_COUNT(#Rows)
RULE_ID_1 DB_NAME_HERE 1STCOLUMN 5 100
RULE_ID_2 DB_NAME_HERE 2NDCOLUMN 0 100
RULE_ID_3 DB_NAME_HERE 3RDCOLUMN 2 100

CodePudding user response:

The way you've laid your query out is creating a lot of extra effort for both you and the SQL Server.

First, You can do most of in in a single query, if you adopt a very slightly different structure using "conditional aggregation"; that is COUNT() with a CASE expression inside it.

Then you can simply add another column for the TOTAL_COUNT. Though, I'm not entirely sure what you want for that total, so I've included a few options...

SELECT
  CAST('DB_NAME_HERE') AS VARCHAR(255)) AS DB_NAME,

  CAST('1STCOLUMN')    AS VARCHAR(255))                   AS RULE_ID_1_COLUMN_NAME,
  COUNT(CASE WHEN [Name of column 1] IS NULL THEN 1 END)  AS RULE_ID_1_DEFECT_COUNT,

  CAST('2NDCOLUMN')    AS VARCHAR(255))                   AS RULE_ID_2_COLUMN_NAME,
  COUNT(CASE WHEN [Name of column 2] IS NULL THEN 1 END)  AS RULE_ID_2_DEFECT_COUNT,

  CAST('3RDCOLUMN')    AS VARCHAR(255))                   AS RULE_ID_3_COLUMN_NAME,
  COUNT(CASE WHEN [Name of column 3] IS NULL THEN 1 END)  AS RULE_ID_3_DEFECT_COUNT,

  COUNT(*)                                                AS TOTAL_DAILY_ROWS,
  COUNT(CASE WHEN [Name of column 1] IS NULL
               OR [Name of column 2] IS NULL 
               OR [Name of column 3] IS NULL THEN 1 END)  AS TOTAL_DAILY_ROWS_WITH_NULLS,
  (SELECT COUNT(*) FROM [Name of Db])                     AS TABLE_TOTAL_ROWS
FROM
  [Name of Db]
WHERE
  DateField = DATEADD(DAY, -1, CAST(GETDATE() AS DATE))

Then, if you really must have your existing format, you can un-pivot it...

WITH
  yesterday_summary AS
(
  SELECT
    CAST('DB_NAME_HERE') AS VARCHAR(255)) AS DB_NAME,
    COUNT(CASE WHEN [Name of column 1] IS NULL THEN 1 END)  AS RULE_ID_1_DEFECT_COUNT,
    COUNT(CASE WHEN [Name of column 2] IS NULL THEN 1 END)  AS RULE_ID_2_DEFECT_COUNT,
    COUNT(CASE WHEN [Name of column 3] IS NULL THEN 1 END)  AS RULE_ID_3_DEFECT_COUNT
  FROM
    [Name of Db]
  WHERE
    DateField = DATEADD(DAY, -1, CAST(GETDATE() AS DATE))
)
SELECT
  y.DB_NAME,
  p.RULE_ID,
  p.COLUMN_NAME,
  p.DEFECT_COUNT,
  (SELECT COUNT(*) FROM [Name of Db])   AS TABLE_TOTAL_ROWS
FROM
  yesterday_summary   AS y
CROSS APPLY
(
  VALUES
    ('RULE_ID_1', '1STCOLUMN', y.RULE_ID_1_DEFECT_COUNT),
    ('RULE_ID_2', '2NDCOLUMN', y.RULE_ID_2_DEFECT_COUNT),
    ('RULE_ID_3', '3RDCOLUMN', y.RULE_ID_3_DEFECT_COUNT)
)
  AS p(RULE_ID, COLUMN_NAME, DEFECT_COUNT)
  • Related