Home > Back-end >  Conditional count of rows where at least one peer qualifies
Conditional count of rows where at least one peer qualifies

Time:09-26

Background

I'm a novice SQL user. Using PostgreSQL 13 on Windows 10 locally, I have a table t:

 -- --------- ------- 
|id|treatment|outcome|
 -- --------- ------- 
|a |1        |0      |
|a |1        |1      |
|b |0        |1      |
|c |1        |0      |
|c |0        |1      |
|c |1        |1      |
 -- --------- ------- 

The Problem

I didn't explain myself well initially, so I've rewritten the goal.

Desired result:

 ----------------------- ----- 
|ever treated           |count|
 ----------------------- ----- 
|0                      |1    |
|1                      |3    |
 ----------------------- ----- 

First, identify id that have ever been treated. Being "ever treated" means having any row with treatment = 1.

Second, count rows with outcome = 1 for each of those two groups. From my original table, the ids who are "ever treated" have a total of 3 outcome = 1, and the "never treated", so to speak, have 1 `outcome = 1.

What I've tried

I can get much of the way there, I think, with something like this:

select treatment, count(outcome)
from t
group by treatment;

But that only gets me this result:

 --------- ----- 
|treatment|count|
 --------- ----- 
|0        |2    |
|1        |4    |
 --------- ----- 

CodePudding user response:

For the updated question:

SELECT ever_treated, sum(outcome_ct) AS count
FROM  (
   SELECT id
        , max(treatment) AS ever_treated
        , count(*) FILTER (WHERE outcome = 1) AS outcome_ct
   FROM   t
   GROUP  BY 1
   ) sub
GROUP  BY 1;
 ever_treated | count 
-------------- -------
            0 |     1
            1 |     3

db<>fiddle here

Read:

  • For those who got no treatment at all (all treatment = 0), we see 1 x outcome = 1.
  • For those who got any treatment (at least one treatment = 1), we see 3 x outcome = 1.

Would be simpler and faster with proper boolean values instead of integer.

CodePudding user response:

here is an easy to follow subquery logic that works with integer:

select subq.treatment, count(subq.outcome) 
from (select * from t where outcome = 1 and 
      (id in (select distinct(id) from t where treatment = 1))) as subq 
group by subq.treatment ;
  • Related