There are two tables. Tables A has following structure
ID | Flag | Name |
---|---|---|
1X | 1 | Y |
2Y | 0 | Null |
3Z | 1 | Null |
4A | 1 | Y |
Table B has the following structure
B_ID | City | State |
---|---|---|
1X | Y | Null |
2Y | Null | Null |
3Z | Null | Y |
4A | Y | Y |
I want to get the count of all the IDs and the count of IDs that have Null in any of the columns (name, city, state), for e.g from the tables above only the ID 4A has non null value in all the three columns across both the tables, so the output should be like
Total_Count | Ids having null | Percentage missing |
---|---|---|
4 | 3 | 0.75% |
Total_count is 4 as there are total of four IDs, ID having NULL is 3 because there are 3 IDs that have null in any of the three columns (viz. name,city,state), and Percentage missing is just IDs having null / Total_Count.
I tried using a query along the following lines
select (count/total) * 100 pct, count,total
from (select sum(count) count
from(select count(*) count from tableA T1
where T1.name is null
union all
select count(*) count from tableA T1
join tableB T2 on T1.ID = T2.B_ID
where T2.city is null
union all
select count(*) count from tableA T1
join tableB T2 on T1.ID = T2.B_ID
where T2.state is null)),
select count(ID) total from tableA);
But the query is not returning the desired output, can you please suggest me a better way? Thank You
CodePudding user response:
Use conditional aggregation:
SELECT COUNT(*) Total_Count,
COUNT(CASE WHEN t1.Name IS NULL OR t2.City IS NULL OR t2.State IS NULL THEN 1 END) Ids_having_null,
AVG(CASE WHEN COALESCE(t1.Name, t2.City, t2.State) IS NOT NULL THEN 1 ELSE 0 END) Percentage_missing
FROM Table1 t1 INNER JOIN Table2 t2
ON t2.B_ID = t1.ID;
See the demo.
CodePudding user response:
If you don't know if either table has all the ID's?
Then I suggest a full join
with some conditional aggregation in a sub-query.
For example:
select Total as "Total_Count" , TotalMissing as "Ids having null" , (TotalMissing / Total)*100||'%' as "Percentage missing" from ( select count(distinct coalesce(a.ID, b.B_ID)) as Total , count(distinct case when a.name is null or b.city is null or b.state is null then coalesce(a.ID, b.B_ID) end) as TotalMissing from TableA a full outer join TableB b on a.ID = b.B_ID ) q
Total_Count | Ids having null | Percentage missing ----------: | --------------: | :----------------- 4 | 3 | 75%
db<>fiddle here
CodePudding user response:
Try this ->
select total_count, (total_count - cn) as ids_having_null,
(total_count - cn) *100 / total_count as Percentage_missing
FROM
(select count(t1.ID) as cn , t1.total_count
FROM ( select ID,Name, sum(tmp_col) over ( order by tmp_col) as total_count from (select ID,Name, 1 as tmp_col from tableA ) ) t1
JOIN TableB t2
ON t1.ID = t2.B_ID
WHERE t1.Name is not null and t2.City is not null and t2.State is not null );
Based on your requirement for percentage or ratio, you can alter the logic for Percentage_missing column