Home > Enterprise >  How to count rows by date for each of two values?
How to count rows by date for each of two values?

Time:09-17

I have a large dataset in SQL Server that contains two date fields and a foreign key field (among others):

 ---- --------- ------------- ------------- 
| id | type_id | report_date | import_date |
 ---- --------- ------------- ------------- 
|  1 |       1 | 2021-08-01  | 2021-08-02  |
|  2 |       1 | 2021-08-01  | 2021-08-02  |
|  3 |       2 | 2021-08-01  | 2021-08-02  |
|  4 |       2 | 2021-08-04  | 2021-08-05  |
|  5 |       1 | 2021-08-04  | 2021-08-05  |
|  6 |       3 | 2021-08-04  | 2021-08-05  |
|  7 |       2 | 2021-08-04  | 2021-08-04  |
 ---- --------- ------------- ------------- 

I need a query that can count the number of rows for each day, but to count them for each distinct value in the type_id column. The closest I can wrap my brain around right now is returning the total count of all rows for a particular date:

select count(REPORT_DATE) as records,
       REPORT_DATE        as report_date
from MY_TABLE
group by REPORT_DATE;

How do I split this up to return an additional column for each value of type_id? For example:

 ------------ --------- --------- --------- 
|    date    | count_1 | count_2 | count_3 |
 ------------ --------- --------- --------- 
| 2021-08-01 |       2 |       1 |       0 |
| 2021-08-04 |       1 |       2 |       1 |
 ------------ --------- --------- --------- 

I assume I will need at least one subquery, but that is beyond my current knowledge.

CodePudding user response:

You can use a PIVOT or a conditional aggregate.

PIVOT:

;WITH src AS
(
  SELECT report_date, type_id, c = COUNT(*) 
  FROM dbo.MY_TABLE
  GROUP BY report_date, type_id
)
SELECT report_date,
  count_1 = COALESCE([1], 0),
  count_2 = COALESCE([2], 0),
  count_3 = COALESCE([3], 0)
FROM src 
PIVOT (MAX(c) FOR type_id IN ([1],[2],[3])) AS p;

Conditional aggregate:

SELECT report_date,
  count_1 = SUM(CASE WHEN type_id = 1 THEN 1 ELSE 0 END),
  count_2 = SUM(CASE WHEN type_id = 2 THEN 1 ELSE 0 END),
  count_3 = SUM(CASE WHEN type_id = 3 THEN 1 ELSE 0 END)
FROM dbo.MY_TABLE
GROUP BY report_date;

Example db<>fiddle

Use whichever one is more intuitive for you to learn and understand, though I will say conditional aggregate is certainly easier to maintain when, say, type_id = 4 shows up later, as @jarlh suggested in a comment.

  • Related