I need remove duplicates in SELECT before count them. I need to count how many test have been done. A patient can't get same value for the same test. I had tried DISTINCT
but it would be a GROUP BY(DISTINCT)
instead of GROUP BY(GROUP BY)
. It is pretty much the same but DISTINCT
will remove other columns I need in patient_test table. I used a view as a good practice, but I'm not sure that is helping.
TLDR: I want to improve a nested GROUP BY, but I can't find a way.
table: patient_test (12 rows)
Patient | Test | Value
--------------------------------
Laura | Test A | 1.3
Laura | Test B | 8.9 <-
Laura | Test B | 8.9 <-
Laura | Test C | 8.9
Charles | Test B | 8.9
Charles | Test B | 4.6
Charles | Test C | 5.6
Maria | Test A | 1.2 <--
Maria | Test A | 1.2 <--
Maria | Test B | 1.2
Maria | Test B | 8.9
Maria | Test C | 1.2
I exclude duplicates this way
SELECT * FROM patient_test
GROUP BY Patient , Test, Value
to get:
table: patient_test (10 rows)
Patient | Test | Value
--------------------------------
Laura | Test A | 1.3
Laura | Test B | 8.9
Laura | Test C | 8.9
Charles | Test B | 8.9
Charles | Test B | 4.6
Charles | Test C | 5.6
Maria | Test A | 1.2
Maria | Test B | 1.2
Maria | Test B | 8.9
Maria | Test C | 1.2
Then I count them this way: (What I want to improve)
SELECT TestName, count(*) AS Result FROM
(SELECT * FROM patient_test
GROUP BY Patient , Test, Value) AS tempPatient
GROUP BY tempPatient.TestName;
I'm using a view, but I think that exclude duplicates and them GROUP BY
right away could be better.
SELECT TestName, count(*) AS Result FROM
(CustomView) AS tempPatient
GROUP BY tempPatient.TestName;
This is what I need:
TestName | Result
----------------------
Test A | 2
Test B | 5
Test C | 3
I think that is not necessary to run GROUP BY
twice. I would like to find a way to exclude duplicates using the examples.
Any help or suggestion would be nice.
CodePudding user response:
Seems fine what you have done. I would do something very similar:
SELECT
T.Test,
COUNT(*) AS Result
FROM
(
SELECT DISTINCT
Patient,
Test,
Value
FROM
patient_test
) T
GROUP BY
T.Test
;
If you have duplicates in the table, you have to use grouping or distinct to remove them - that is pretty much standard operating procedure in a case like this.
CodePudding user response:
On MySQL 8
you can use row_number , the partition by patient,test,value
will create unique values. If the same values exists row_num will be >1.
with cte as
( select *, row_number () over(partition by patient,test,value ) as row_num
from patient_test
) select *
from cte;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=089755b482817991b06e58a5caa675e9
To count only unique values add where row_num=1
condition
with cte as
( select *, row_number () over(partition by patient,test,value ) as row_num
from patient_test
) select test,count(value) as nr_count
from cte
where row_num=1
group by test
order by test asc;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=e1841a84eb872d1c047c2c969d80a002
CodePudding user response:
Unlike other databases MySql allows more than one columns referenced with DISTINCT
inside COUNT()
:
SELECT Test, COUNT(DISTINCT Patient, Value) AS Result
FROM patient_test
GROUP BY Test;
See the demo.