I'm new to SQL, but I have a dataset that has students, their class subjects, and if there was an error in their work. I want to know how many students have at least 1 error in any subject. Thus, whether a student has one subject with an error (like students 2 and 3 in the example) or multiple errors (like student 4), they'd be flagged. Only if they have no errors should they be categorized as 'no'.
I know I have to use GROUP BY and COUNT, and I'm thinking I have to use HAVING as well, but I can't seem to put it together. Here's a sample dataset:
ID Rating Error
==========================================
1 English No
1 Math No
2 English Yes
2 Math No
2 Science No
3 English Yes
4 English Yes
4 Math Yes
And the desired output:
Error Count Percent
==========================================
No 1 .25
Yes 3 .75
CodePudding user response:
there are many different ways you can do it, here is one example by using CTE (common table expressions):
with t as (
select
id,
case when sum(case when error='Yes' then 1 else 0 end) > 0 then 'Yes' else 'No' end as error
from students
group by id
)
select
error,
count(*),
(0.0 count(*)) / (select count(*) from t) as perc
from t
group by error
basically, inner query (t
) is used to calculate error status for each student, outer query calculates error distribution/percentage numbers