How can I run a mutually exclusive COUNT in Postgres?-CodePudding

Say I have a (virtual) table like the following:

firstColumn	secondColumn
fcVal1	scVal1
fcVal1	scVal2
fcVal2	scVal1
fcVal3	scVal2

I want to find the following counts:

Values in firstColumn that have both scVal1 and scVal2
Values in firstColumn that have only scVal1
Values in firstColumn that have only scVal2

So the result would be:

scVals	counts
scVal1 scVal2	1
scVal1	1
scVal2	1

Note that these counts are mutually exclusive - fcVal1 is NOT included in the counts for having only scVal1 or scVal2

Preserving which firstColumn vals are associated with the secondColumn vals is not important, just the aggregate counts.

Thanks!

CodePudding user response：

If you are willing to hard code the combinations it's trivial to find out the counts. For example:

with
s as (
  select
    firstColumn,
    count(case when secondColumn = 'scVal1' then 1 end) as c1,
    count(case when secondColumn = 'scVal2' then 1 end) as c2
  from t
  group by firstColumn
)
select 'scVal1', count(*) from s where c1 > 0 and c2 = 0
union all select 'scVal2', count(*) from s where c1 = 0 and c2 > 0
union all select 'scVal1   scVal2', count(*) from s where c1 > 0 and c2 > 0

Result:

 ?column?         count 
 ---------------- ----- 
 scVal1           1     
 scVal2           1     
 scVal1   scVal2  1

See running example at DB Fiddle.

A more generic solution where the combinations are not known from the beginning is also possible (though more complex) for low numbers of distinct values.

CodePudding user response：

You can use a self join to get a 'scVal1 scVal2' count

select count(distinct t1.firstColumn) n12
from tbl t1
join tbl t2 on t1.firstColumn = t2.firstColumn  
   and t1.secondColumn = 'scVal1' and t2.secondColumn = 'scVal2'

Alternatively, all three counts in a row

select 
   count(f1) n1,
   count(f2) n2,
   count(f1   f2) n12
from (
  select
     max(case secondColumn when 'scVal1' then 1 end) f1,
     max(case secondColumn when 'scVal2' then 1 end) f2
  from tbl
  where secondColumn in ('scVal1', 'scVal2')
  group by firstColumn
) t