Home > Net >  Count occurrences of each combination
Count occurrences of each combination

Time:08-31

I have 5 conditions (A-E) and a bunch of patient ID's. My data set is 2 columns: PatientID, Condition.

There are duplicate PatientID's with every new condition:

PatientID Condition
456 C
456 E
279 D
123 A
123 C
123 D
187 D
296 E
296 C

I believe there are 31 different potential combinations (order doesn't matter) of those 5 conditions (ie A, AB, ABC, AC, ACDE, etc)

I want to count how many patients in each combination of conditions. So my results for the above would be - CE: 2 D: 2 ACD: 1

I'm more familiar with Excel but if this is better handled in SQL, I can do it there. I think I need to create a table of all the different combinations (any help on that would be appreciated too) and then do a count from there but I'm not sure if that's the best way.

CodePudding user response:

SQL Server solution

SELECT
    conditions,
    COUNT(*)
FROM (
    SELECT
        patientid,
        STRING_AGG(condition, '') WITHIN GROUP (ORDER BY condition) conditions
    FROM tbl
    GROUP BY patientid
) c
GROUP BY conditions 

Output

conditions patient_count
ACD 1
CE 2
D 2

db<>fiddle enter image description here

  • Related