SQL query : unique studentid that occur with other studentid in other project for more than 10 times-CodePudding

I have a project table which contains columns studentid, projectid. The students worked in group to complete the project and a students is allowed to participate in more than one project. I wish to query on how many students have partnered up together on 10 or more project.

studentid	projectid
1	abx
2	abx
3	abx
1	abn
4	abn

How do I do a nested query to find out each unique studentid have occurrence with other studentid for more than 10 times?

CodePudding user response：

First self-join the table in order to get the students that worked together. Then aggregate and count.

In order not to look at the same pair twice (s1/s2 <=> s2/s1), I don't join on s1 <> s2, but on s1 < s2.

select t1.studentid, t2.studentid, count(*)
from mytable t1
join mytable t2 on t2.projectid = t1.projectid and t2.studentid > t1.studentid
group by t1.studentid, t2.studentid
having count(*) >= 10;

CodePudding user response：

We do a self-join on project and group by pairs of students.
Where student < student avoids having the same student with themself and the same pairs of students both ways around.
The icing on the cake is a list of projects togehter using group_concat (use string_agg in SQL server.

create table project (
student int,
project varchar(10));

insert into project values
(1,   'abx'),
(2,   'abx'),
(3,   'abx'),
(4,   'abx'),
(1,   'abn'),
(4,   'abn');

select 
  group_concat(a.project) projects,
  a.student student_1, 
  b.student student_2,
  count(a.project) Number_projects
from project a
join project b on a.project = b.project
where a.student < b.student
group by
 a.student,
 b.student;

projects | student_1 | student_2 | Number_projects :------- | --------: | --------: | --------------: abx | 1 | 2 | 1 abx | 1 | 3 | 1 abn,abx | 1 | 4 | 2 abx | 2 | 3 | 1 abx | 2 | 4 | 1 abx | 3 | 4 | 1

db<>fiddle here