I have a large dataset in SQL Server that contains two date fields and a foreign key field (among others):
---- --------- ------------- -------------
| id | type_id | report_date | import_date |
---- --------- ------------- -------------
| 1 | 1 | 2021-08-01 | 2021-08-02 |
| 2 | 1 | 2021-08-01 | 2021-08-02 |
| 3 | 2 | 2021-08-01 | 2021-08-02 |
| 4 | 2 | 2021-08-04 | 2021-08-05 |
| 5 | 1 | 2021-08-04 | 2021-08-05 |
| 6 | 3 | 2021-08-04 | 2021-08-05 |
| 7 | 2 | 2021-08-04 | 2021-08-04 |
---- --------- ------------- -------------
I need a query that can count the number of rows for each day, but to count them for each distinct value in the type_id
column. The closest I can wrap my brain around right now is returning the total count of all rows for a particular date:
select count(REPORT_DATE) as records,
REPORT_DATE as report_date
from MY_TABLE
group by REPORT_DATE;
How do I split this up to return an additional column for each value of type_id
? For example:
------------ --------- --------- ---------
| date | count_1 | count_2 | count_3 |
------------ --------- --------- ---------
| 2021-08-01 | 2 | 1 | 0 |
| 2021-08-04 | 1 | 2 | 1 |
------------ --------- --------- ---------
I assume I will need at least one subquery, but that is beyond my current knowledge.
CodePudding user response:
You can use a PIVOT
or a conditional aggregate.
PIVOT:
;WITH src AS
(
SELECT report_date, type_id, c = COUNT(*)
FROM dbo.MY_TABLE
GROUP BY report_date, type_id
)
SELECT report_date,
count_1 = COALESCE([1], 0),
count_2 = COALESCE([2], 0),
count_3 = COALESCE([3], 0)
FROM src
PIVOT (MAX(c) FOR type_id IN ([1],[2],[3])) AS p;
Conditional aggregate:
SELECT report_date,
count_1 = SUM(CASE WHEN type_id = 1 THEN 1 ELSE 0 END),
count_2 = SUM(CASE WHEN type_id = 2 THEN 1 ELSE 0 END),
count_3 = SUM(CASE WHEN type_id = 3 THEN 1 ELSE 0 END)
FROM dbo.MY_TABLE
GROUP BY report_date;
Example db<>fiddle
Use whichever one is more intuitive for you to learn and understand, though I will say conditional aggregate is certainly easier to maintain when, say, type_id = 4
shows up later, as @jarlh suggested in a comment.