SQL Combination Match Count-CodePudding

My aim is to calculate all possible combinations of the unique entries in column 'role' and then count how many times each combination has the same refid. I've built something and it's working, however it's really slow so I'm wondering if anyone has a better solution?

Data Model

refid	role
1000	xxx
1000	yyy
1001	xxx
1001	yyy

Output Table

a_role	b_role	match_count
xxx	yyy	2

Existing Solution

I've written a stored procedure that performs the following steps:

Generate a list of all combinations of unique roles in column role (97032 combinations)
While loop through all entries of step 1, update the entry with calculated count

Appreciate the help.

CodePudding user response：

You could use a self join and aggregate as the following:

SELECT T.role a_role, D.role b_role, 
       COUNT(*) match_count
FROM table_name T
JOIN table_name D ON T.refid = D.refid AND T.role < D.role
GROUP BY T.role, D.role
ORDER BY T.role, D.role

See a demo.

CodePudding user response：

I set up a little test script, try this out with your data and see if it's any faster. Note this assumes you only have two roles per refid. It'll need a rework if you have more.

declare @input table (
refid int,
descriptor nvarchar(1)
)

insert into @input (refid, descriptor) values
(1000, 'x'),
(1000, 'y'),
(1001, 'x'),
(1001, 'y'),
(1002, 'a'),
(1002, 'b')

select role1, role2, count(*) from (
    select min(descriptor) as role1, max(descriptor) as role2
    from @input
    group by refid
) a 
group by role1, role2