Home > database >  COUNT() returns the total number of rows in the grouped table
COUNT() returns the total number of rows in the grouped table

Time:12-31

I have two tables:

Job table:

[![enter image description here][1]][1]

FailedReason table which is referenced to the Job table by :

[![enter image description here][2]][2]

My goal is to compute the failure ratio based on the failure reason.

My expectation is to get a result with the first column containing the failure reason name, the second column containing the total number of all jobs ('successful' 'failed'), the third column containing the total number of failed jobs caused by the reason, and the fourth column containing the failure ratio calculated using the following formula: failed count (3 column) / total count (2 column) * 100.

My sql query :

SELECT
  FailedReason.main_reason as "Failure reason",
  COUNT(job.name) AS "Total jobs",
  SUM(CASE WHEN job.status='failed' THEN 1 ELSE 0 END) AS "Total failed jobs",
  SUM(CASE WHEN job.status='failed' THEN 1 ELSE 0 END) / COUNT(job.name) * 100 AS "Failure ratio"
FROM job
LEFT JOIN FailedReason
ON job.id=FailedReason.job_id
GROUP BY 1
ORDER BY 3 DESC

And I get the result where the total number of jobs from the aggregated table is counted. As a result, the failure rate is one hundred percent.

[![enter image description here][3]][3]

What should I modify to get the correct number of jobs ('succeeded' 'failed') and calculate the correct ratio value

Sample data:

CREATE TABLE failedreason (id INT PRIMARY KEY AUTO_INCREMENT,
                  job_id INT REFERENCES job(id),
                  main_reason varchar(255)
                 );



INSERT INTO failedreason (job_id, main_reason) VALUES (13095427, 'test case failure'),
                                    (13095407, 'test case failure'),
                                    (13095533, 'connection error'),
                                    (13095546, 'connection error'),
                                    (13098367, 'runner connection error'),
                                    (13101522, 'script error');

CREATE TABLE job (id INT PRIMARY KEY,
                  created_at date,
                  finished_at date,
                  status varchar(255)
                 );
INSERT INTO job (id,
                  created_at ,
                  finished_at ,
                  status
                 )
VALUES (13095427,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13095407,  '2021-05-03 02:50:39', '2021-05-03 03:46:41', 'failed'),
       (13095533,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13095546,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13098367,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed'),
       (13101522,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'failed');
       (13101444,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success');
       (13101445,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success');
       (13101446,  '2021-05-03 02:50:41', '2021-05-03 03:47:27', 'success');



  [1]: https://i.stack.imgur.com/CYnPg.png
  [2]: https://i.stack.imgur.com/t3baS.png
  [3]: https://i.stack.imgur.com/LC7Hp.png

CodePudding user response:

Count the unique jobs, instead of linked failurereasons

SELECT
  FailedReason.main_reason as "Failure reason",
  COUNT(DISTINCT job.id) AS "Total jobs",
  COUNT(DISTINCT CASE WHEN job.status='failed' THEN job.id END) AS "Total failed jobs",
  COUNT(DISTINCT CASE WHEN job.status='failed' THEN job.id END) / MAX(tots.total_jobs) * 100 AS "Failure ratio"
FROM job
CROSS JOIN (SELECT COUNT(*) AS total_jobs FROM job) tots
LEFT JOIN FailedReason
ON job.id = FailedReason.job_id
GROUP BY FailedReason.main_reason
ORDER BY 3 DESC

CodePudding user response:

Your query is not giving correct total and so % are not proper.

Please try this one

select reason, total, failed,
cast( (cast(failed as numeric(10,2))/total) * 100 as numeric(10,2)) failedPercent from 
(SELECT
  FailedReason.reason reason,
  MAX(tots.total_jobs)  total,
  COUNT(DISTINCT CASE WHEN job.status='failed' THEN job.id END) failed
FROM job
CROSS JOIN (SELECT COUNT(*) AS total_jobs FROM job) tots
INNER JOIN FailedReason ON job.id = FailedReason.job_id 
GROUP BY FailedReason.reason) aa

Result on shared query data enter image description here

CodePudding user response:

Use SUM instead of COUNT

SELECT f.main_reason AS "main reason",
       COUNT(*) AS "Total jobs",
       SUM(CASE WHEN j.status='failed' THEN 1 ELSE 0 END) AS "Total failed jobs",
       (SUM(CASE WHEN j.status='failed' THEN 1 ELSE 0 END) / COUNT(*)) * 100 AS "Failure ratio"
FROM job j
LEFT JOIN failedreason f ON j.id = f.job_id
GROUP BY f.main_reason

Demo in db<>fiddle

  • Related