SQL count occurrences without counting duplicates in another column

Time:12-01

I have a table:

Date       | ID | Company | Click 
----------- ---- --------- --------
01/01/2021 | 01 | Us      | 1
01/01/2021 | 01 | Us      | 1
01/01/2021 | 01 | Other   | 1
01/01/2021 | 02 | Us      | 0 
01/01/2021 | 02 | Other   | 0
02/01/2021 | 03 | Us      | 1 
02/01/2021 | 03 | Us      | 1 
02/01/2021 | 04 | Us      | 0

I want to group by date and count two things per day: how many distinct IDs there are, and how many distinct IDs have Click = 1 and Company = "Us".

My current code is:

create table grouped as 
select date
, count(distinct ID) as ID_count
, sum(case when company="Us" and click=1 then 1 else 0 end) as Click_count
from have 
group by 1

The result should look like:

Date       | ID_count | Click_count
----------- ---------- ------------
01/01/2021 | 2        | 1
02/01/2021 | 2        | 1

My code counts every matching row, so duplicated IDs are counted more than once and Click_count comes out as 2 on both dates instead of 1. How can I fix that?

CodePudding user response:

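Use COUNT(DISTINCT ...) with a CASE expression instead of SUM(). The CASE returns the ID for matching rows and NULL otherwise; COUNT ignores NULLs, and DISTINCT counts each ID only once:

COUNT(DISTINCT CASE WHEN company = 'Us' AND click = 1 THEN ID END) AS click_count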
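
To check this against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in engine here (the question does not say which database is used), the table name `have` comes from the question's query, and the column name `click` is assumed from the table header. The query returns the original SUM() alongside the COUNT(DISTINCT CASE ...) so the two can be compared:

```python
import sqlite3

# Sample rows copied from the question's table (Date, ID, Company, Click).
SAMPLE = [
    ("01/01/2021", "01", "Us", 1),
    ("01/01/2021", "01", "Us", 1),
    ("01/01/2021", "01", "Other", 1),
    ("01/01/2021", "02", "Us", 0),
    ("01/01/2021", "02", "Other", 0),
    ("02/01/2021", "03", "Us", 1),
    ("02/01/2021", "03", "Us", 1),
    ("02/01/2021", "04", "Us", 0),
]

# In-memory SQLite database; an assumption made only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (date TEXT, id TEXT, company TEXT, click INTEGER)")
conn.executemany("INSERT INTO have VALUES (?, ?, ?, ?)", SAMPLE)

query = """
SELECT date,
       COUNT(DISTINCT id) AS id_count,
       -- original approach: adds 1 for every matching row, duplicates included
       SUM(CASE WHEN company = 'Us' AND click = 1 THEN 1 ELSE 0 END) AS row_count,
       -- suggested fix: CASE yields NULL for non-matching rows, DISTINCT removes repeats
       COUNT(DISTINCT CASE WHEN company = 'Us' AND click = 1 THEN id END) AS click_count
FROM have
GROUP BY date
ORDER BY date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On this data row_count is 2 for both dates while click_count is 1, which matches the expected result in the question.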

CodePudding user response:

A PostgreSQL solution using the FILTER clause for conditional aggregation:

select date, 
       count(distinct id) id_count,
       count(distinct id) filter (where click = 1 and company = 'Us') click_count 
from the_table
group by date;

If your database does not support the FILTER clause on aggregates, a correlated scalar subquery (an alternative to @forpas's suggestion) will do:

select date, 
       count(distinct id) id_count,
       (
         select count(distinct id) 
         from the_table
         where click = 1 and company = 'Us' and date = t.date
       ) click_count 
from the_table t
group by date;
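
The scalar subquery version can be tried on the question's sample data with a small sketch like the one below. It uses Python's built-in sqlite3 module purely as a convenient test engine, which is an assumption and not part of the original answer. The inner table gets the hypothetical alias `s` so that the correlation `s.date = t.date` is explicit:

```python
import sqlite3

# In-memory SQLite database loaded with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (date TEXT, id TEXT, company TEXT, click INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [
        ("01/01/2021", "01", "Us", 1),
        ("01/01/2021", "01", "Us", 1),
        ("01/01/2021", "01", "Other", 1),
        ("01/01/2021", "02", "Us", 0),
        ("01/01/2021", "02", "Other", 0),
        ("02/01/2021", "03", "Us", 1),
        ("02/01/2021", "03", "Us", 1),
        ("02/01/2021", "04", "Us", 0),
    ],
)

# For each grouped date, the subquery recounts the distinct IDs that clicked for 'Us'.
subquery_rows = conn.execute(
    """
    SELECT t.date,
           COUNT(DISTINCT t.id) AS id_count,
           (
             SELECT COUNT(DISTINCT s.id)
             FROM the_table s
             WHERE s.click = 1 AND s.company = 'Us' AND s.date = t.date
           ) AS click_count
    FROM the_table t
    GROUP BY t.date
    ORDER BY t.date
    """
).fetchall()
for row in subquery_rows:
    print(row)
```

The subquery is evaluated once per output date, so on large tables the single-pass COUNT(DISTINCT CASE ...) form may be cheaper; that depends on the engine and indexes.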
