Home > Mobile >  Oracle | Group by to fetch unique comma separated values
Oracle | Group by to fetch unique comma separated values

Time:04-27

I have a table with a skills column having comma separated values. The skills can have duplicates per student. I need to use group by to list unique skills per student.

What should I be using so that I get list with no duplicates. Please help.

Input Table student_skills:

STUDENT_ID SEMESTER SKILLS
101 1 C, SQL
101 2 C, CPP
102 1 CPP, Java
102 2 Java, JavaScript

Desired result:

STUDENT_ID SKILLS
101 C, SQL, CPP
102 CPP, Java, JavaScript

SQL Queries to create table, insert data:

create table student_skills
(STUDENT_ID number(10),
SEMESTER varchar2(5),
SKILLS varchar2(50));

insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (101, '1', 'C, SQL');
insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (101, '2', 'C, CPP');
insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (102, '1', 'CPP, Java');
insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (102, '2', 'Java, JavaScript');
commit;

CodePudding user response:

One option is to split skills into rows (the temp CTE), and then aggregate them back (line #11):

SQL> with temp as
  2    (select distinct
  3            student_id,
  4            trim(regexp_substr(skills, '[^,] ', 1, column_value)) skill
  5     from student_skills cross join
  6       table(cast(multiset(select level from dual
  7                           connect by level <= regexp_count(skills, ',')   1
  8                          ) as sys.odcinumberlist))
  9    )
 10  select student_id,
 11         listagg(skill, ', ') within group (order by skill) skills
 12  from temp
 13  group by student_id;

STUDENT_ID SKILLS
---------- ------------------------------
       101 C, CPP, SQL
       102 CPP, Java, JavaScript

SQL>

CodePudding user response:

You can use a subquery along with DISTINCT clause while extracting the substrings seperated by comma, then apply LISTAGG() function to recombine the pieces such as

WITH s AS
(
 SELECT DISTINCT student_id, REGEXP_SUBSTR(skills,'[^ ,] ',1,level) AS skills
   FROM student_skills
CONNECT BY level <= REGEXP_COUNT(skills,',') 1
    AND PRIOR student_id = student_id
    AND PRIOR sys_guid() IS NOT NULL 
)
SELECT student_id, LISTAGG(skills,', ') WITHIN GROUP (ORDER BY 0) AS skills
  FROM s
 GROUP BY student_id 

Demo

  • Related