Home > Mobile >  How can I run a mutually exclusive COUNT in Postgres?
How can I run a mutually exclusive COUNT in Postgres?

Time:02-10

Say I have a (virtual) table like the following:

firstColumn secondColumn
fcVal1 scVal1
fcVal1 scVal2
fcVal2 scVal1
fcVal3 scVal2

I want to find the following counts:

  • Values in firstColumn that have both scVal1 and scVal2
  • Values in firstColumn that have only scVal1
  • Values in firstColumn that have only scVal2

So the result would be:

scVals counts
scVal1 scVal2 1
scVal1 1
scVal2 1

Note that these counts are mutually exclusive - fcVal1 is NOT included in the counts for having only scVal1 or scVal2

Preserving which firstColumn vals are associated with the secondColumn vals is not important, just the aggregate counts.

Thanks!

CodePudding user response:

If you are willing to hard code the combinations it's trivial to find out the counts. For example:

with
s as (
  select
    firstColumn,
    count(case when secondColumn = 'scVal1' then 1 end) as c1,
    count(case when secondColumn = 'scVal2' then 1 end) as c2
  from t
  group by firstColumn
)
select 'scVal1', count(*) from s where c1 > 0 and c2 = 0
union all select 'scVal2', count(*) from s where c1 = 0 and c2 > 0
union all select 'scVal1   scVal2', count(*) from s where c1 > 0 and c2 > 0

Result:

 ?column?         count 
 ---------------- ----- 
 scVal1           1     
 scVal2           1     
 scVal1   scVal2  1     

See running example at DB Fiddle.

A more generic solution where the combinations are not known from the beginning is also possible (though more complex) for low numbers of distinct values.

CodePudding user response:

You can use a self join to get a 'scVal1 scVal2' count

select count(distinct t1.firstColumn) n12
from tbl t1
join tbl t2 on t1.firstColumn = t2.firstColumn  
   and t1.secondColumn = 'scVal1' and t2.secondColumn = 'scVal2'

Alternatively, all three counts in a row

select 
   count(f1) n1,
   count(f2) n2,
   count(f1   f2) n12
from (
  select
     max(case secondColumn when 'scVal1' then 1 end) f1,
     max(case secondColumn when 'scVal2' then 1 end) f2
  from tbl
  where secondColumn in ('scVal1', 'scVal2')
  group by firstColumn
) t
  • Related