Home > Mobile >  Splitting a column containing varying number of values into multiple columns, SQL Server
Splitting a column containing varying number of values into multiple columns, SQL Server

Time:03-11

I have the following table named Company, it has many columns but this is the problematic one:

Team
Frodo B (manager), Gandalf G (director), Batman C (cleaner)
John Doe (secretary), Mark Jacobs(manager), Lilly Hopes(director), Rihanna Williams (cleaner), Maddy James (supervisor), Merry Poppins (HR)
Rick Ross (cleaner)
Orlando Bloom (manager), Keira Knightly (secretary)

Every row has different number of people followed by their position in the company in brackets. We know that there are 7 possible positions: cleaner, director, manager, supervisor, secretary, HR, owner

The required result is this column to be replaced by multiple columns - one for each type of position and for each row if there was someone with this position his/her name to appear in the appropriate column. The position which is in brackets is no longer needed once names fall in the right columns, I am just leaving it in the result table to better illustrate the result. The rest of the cells can be blank N/A or whatever.

cleaner director manager supervisor secretary HR owner
Batman C Gandalf G Frodo B
Rihanna Williams (cleaner) Lilly Hopes(director) Mark Jacobs(manager) Maddy James (supervisor) John Doe (secretary) Merry Poppins (HR)
Rick Ross (cleaner)
Orlando Bloom (manager) Keira Knightly (secretary)

I know how to do this in python but I need to make it happen in SQL Server, which is not one of my strengths unfortunately. I looked up similar questions but I cannot make it work. Please help.

CodePudding user response:

PIVOT relational operator is an option:

Sample data:

SELECT *
INTO Company
FROM (VALUES
   (1, 'Frodo B (manager), Gandalf G (director), Batman C (cleaner)'),
   (2, 'John Doe (secretary), Mark Jacobs (manager), Lilly Hopes (director), Rihanna Williams (cleaner), Maddy James (supervisor), Merry Poppins (HR)'),
   (3, 'Rick Ross (cleaner)'),
   (4, 'Orlando Bloom (manager), Keira Knightly (secretary)')
) d (Id, Team)

Statement:

SELECT *
FROM (
   SELECT c.Id, LTRIM(s.[value]) AS [Name], p.Position
   FROM Company c
   CROSS APPLY STRING_SPLIT(c.Team, ',') s
   JOIN (VALUES
      ('cleaner'), ('director'), ('manager'), ('supervisor'), ('secretary'), ('HR'), ('owner')
   ) p (Position) ON CHARINDEX(p.Position, s.[value]) > 0
) t
PIVOT (
   MAX(Name) 
   FOR Position IN ([cleaner], [director], [manager], [supervisor], [secretary], [HR], [owner])
) p

Result:

Id cleaner director manager supervisor secretary HR owner
1 Batman C (cleaner) Gandalf G (director) Frodo B (manager)
2 Rihanna Williams (cleaner) Lilly Hopes (director) Mark Jacobs (manager) Maddy James (supervisor) John Doe (secretary) Merry Poppins (HR)
3 Rick Ross (cleaner)
4 Orlando Bloom (manager) Keira Knightly (secretary)

CodePudding user response:

Recursion is another (knowing that it is limited to 7 iterations):

with MyTable as (
select *
from (values 
 (1, 'Frodo B (manager), Gandalf G (director), Batman C (cleaner)')
,(2, 'John Doe (secretary), Mark Jacobs(manager), Lilly Hopes(director), Rihanna Williams (cleaner), Maddy James (supervisor), Merry Poppins (HR)')
,(3, 'Rick Ross (cleaner)')
,(4,'Orlando Bloom (manager), Keira Knightly (secretary)')
) T (Id, Team)
),
RCTE as ( -- Find closing bracket positions
    select Id, FromPos=1 , BctPos=charindex(')',Team), Team from MyTable
    union all
    select R.Id, FromPos=R.BctPos 2, BctPos=charindex(')', M.Team, R.BctPos 1), M.Team 
    from MyTable M 
         inner join
         RCTE R
         on R.Id=M.Id
    where charindex(')',M.Team,R.BctPos 1)>0
),
Parsed as
( select Id, Team, Member=SubString(Team,FromPos,BctPos-FromPos 1)  from RCTE)
select 
  Id
, director=min(case when Member like '%(director)' then left(Member,len(Member)- charindex('(',Member)-1) end)
, cleaner=min(case when Member like '%(cleaner)' then left(Member,len(Member)- charindex('(',Member)-1) end)
, manager=min(case when Member like '%(manager)' then left(Member,len(Member)- charindex('(',Member)-1) end)
, supervisor=min(case when Member like '%(supervisor)' then left(Member,len(Member)- charindex('(',Member)-1) end)
, secretary=min(case when Member like '%(secretary)' then left(Member,len(Member)- charindex('(',Member)-1) end)
, owner=min(case when Member like '%(owner)' then left(Member,len(Member)- charindex('(',Member)-1) end)
from Parsed 
group by id
order by id

I assumed that they are all well formed (brackets are correctly opened closed; and blanks are used consistently, etc.

  • Related