My aim is to calculate all possible combinations of the unique entries in column 'role' and then count how many times each combination has the same refid. I've built something and it's working, however it's really slow so I'm wondering if anyone has a better solution?
Data Model
refid | role |
---|---|
1000 | xxx |
1000 | yyy |
1001 | xxx |
1001 | yyy |
Output Table
a_role | b_role | match_count |
---|---|---|
xxx | yyy | 2 |
Existing Solution
I've written a stored procedure that performs the following steps:
- Generate a list of all combinations of unique roles in column role (97032 combinations)
- While loop through all entries of step 1, update the entry with calculated count
Appreciate the help.
CodePudding user response:
You could use a self join and aggregate as the following:
SELECT T.role a_role, D.role b_role,
COUNT(*) match_count
FROM table_name T
JOIN table_name D ON T.refid = D.refid AND T.role < D.role
GROUP BY T.role, D.role
ORDER BY T.role, D.role
See a demo.
CodePudding user response:
I set up a little test script, try this out with your data and see if it's any faster. Note this assumes you only have two roles per refid. It'll need a rework if you have more.
declare @input table (
refid int,
descriptor nvarchar(1)
)
insert into @input (refid, descriptor) values
(1000, 'x'),
(1000, 'y'),
(1001, 'x'),
(1001, 'y'),
(1002, 'a'),
(1002, 'b')
select role1, role2, count(*) from (
select min(descriptor) as role1, max(descriptor) as role2
from @input
group by refid
) a
group by role1, role2