Home > OS >  How to Count multiple columns of a table, that could have nulls
How to Count multiple columns of a table, that could have nulls

Time:08-18

I have a table in SQL Server with following structure:

============================================================================================
Id    ReferenceNo   ApprovedBy    ApprovedOn           RejectedBy     RejectedOn
============================================================================================
1      R123456      admin     2022-08-17 14:22:19.243     u1      2022-08-15 18:50:49.547
2      R128456      NULL      NULL                        u2      2022-08-12 17:20:29.873
3      R129456      u3        2022-08-19 16:50:29.123     u1      2022-08-15 18:50:49.923
4      R125456      admin     2022-08-17 11:33:49.243     NULL    NULL
5      R127456      u2        2022-08-15 10:19:29.103     u1      2022-08-15 18:34:26.713

I am trying to count the approved and rejected records in this table by usernames in ApprovedBy and RejectedBy columns. The following output is what I am looking for:

=============================================
Username    Approved    Rejected
=============================================
admin       2             0
u1          0             3
u2          1             1
u3          1             0

I have tried this query, but due to a null value in ApprovedBy not getting the count for RejectedBy for some users.

    SELECT ApprovedBy 'Username', COALESCE(COUNT(Id), 0) 'Approved', 
    COALESCE((SELECT COUNT(*) FROM tblItems rej
           WHERE RejectedOn >= '06/01/2022 00:00:00.000' 
             AND RejectedOn <= '08/17/2022 23:59:59.997' 
             AND RejectedBy is not null 
             AND app.ApprovedBy=rej.RejectedBy
           GROUP BY RejectedBy
         ),0) 'Rejected' FROM tblItems app
   WHERE ApprovedOn >= '06/01/2022 00:00:00.000' 
    AND  ApprovedOn <= '08/17/2022 23:59:59.997' 
    AND ApprovedByIS NOT NULL
   GROUP BY ApprovedBy

CodePudding user response:

Use unambiguous dates and avoid trying to find the "end" of the day, because that is prone to breaking if the data types change anywhere. Much safer to use less than the next day (more details here). Anyway one way to solve the problem you're having is to use conditional aggregation:

DECLARE @start date = '20220601',
        @end   date = '20220818'; 

SELECT u.Username, 
  Approved = SUM(CASE WHEN i.ApprovedBy = u.Username THEN 1 ELSE 0 END),
  Rejected = SUM(CASE WHEN i.RejectedBy = u.Username THEN 1 ELSE 0 END)
FROM dbo.Users AS u
LEFT OUTER JOIN dbo.tblItems AS i
ON  
(
  (
    u.Username = i.ApprovedBy 
    AND i.ApprovedOn >= @start 
    AND i.ApprovedOn < @end
  )
  OR 
  (
    u.Username = i.RejectedBy
    AND i.RejectedOn >= @start 
    AND i.RejectedOn < @end
  )
)
GROUP BY u.Username;

This also corrects a logic error in your current query that uses the rejected date to credit u3 with an approval that happened outside your desired range, and includes rows from Users that don't have any approvals or rejections.

If you don't want to include users that are in the Users table and don't have any matching rows in tblItems, just change LEFT OUTER JOIN to INNER JOIN.

CodePudding user response:

If the conditional logic of timeframes doesn't really play any role, the result can be achieved by unpivoting approved and rejected columns, then pivoting approved/rejected rows to columns.

 with cte as (
  select id,
  cast(approvedby as varchar(30)) as approved,
  cast(rejectedby as varchar(30)) as rejected
  from 
  (values
(1, 'R123456', 'admin', '2022-08-17 14:22:19.243', 'u1', '2022-08-15 18:50:49.547'),
(2, 'R128456',  NULL,    NULL,                     'u2', '2022-08-12 17:20:29.873'),
(3, 'R129456', 'u3',    '2022-08-19 16:50:29.123', 'u1', '2022-08-15 18:50:49.923'),
(4, 'R125456', 'admin', '2022-08-17 11:33:49.243',  NULL, NULL),
(5, 'R127456', 'u2',    '2022-08-15 10:19:29.103', 'u1', '2022-08-15 18:34:26.713')
  ) as t(Id, ReferenceNo, ApprovedBy, ApprovedOn, RejectedBy, RejectedOn)
  )
select 
*
from cte
unpivot (username for y in (approved,rejected)) upv
pivot (count(id) for y in ([approved],[rejected])) pv
  • Related