Home > Mobile >  Can SQL STRING_SPLIT use two (or more) separators?
Can SQL STRING_SPLIT use two (or more) separators?

Time:03-06

I have a list of drug names that are stored in various upper and lower case combinations. I need to capitalize the first letter of each word in a string, excluding certain words. The string is separated by spaces, but can also be separated by a forward slash.

The following code works:

create table #exclusionlist (word varchar(25))
create table #drugnames (drugname varchar(50))

insert into #exclusionlist values ('ER')
insert into #exclusionlist values ('HCL')

insert into #drugnames values ('DRUGNAME ER')
insert into #drugnames values ('drugname hcl')
insert into #drugnames values ('ONEDRUG/OTHERDRUG')

select  'Product Name'   = drugname
,       'Product Name 2' = STUFF((SELECT ' '   
                case when value in (select word from #exclusionlist) then upper(value)
                else upper(left(value, 1))   lower(substring(value, 2, len(value))) end
                from STRING_SPLIT(drugname, ' ')
                FOR XML PATH('')) ,1,1,'')
from #drugnames

The output looks like this:

Drugname ER
Drugname HCL
Onedrug/otherdrug

How can I get that last one to look like this:

Onedrug/Otherdrug

I did try STRING_SPLIT(replace(drugname, '/', ' '), ' ') but obviously replaces the slash with a space. And if the slash is at the end of the string like ONEDRUG/OTHERDRUG/ then the result looks like Onedrug Otherdrug 

It's possible that the string may end in a forward slash due to the field only holding N number of characters. When data gets inserted into the table, only the first N characters of the drug name are inserted. If that Nth character is a slash, the string will end in a slash.

CodePudding user response:

You can use a CASE expression for the separator parameter to the STRING_SPLIT function.

In the code below, a common table expression (CTE) uses STRING_SPLIT to split out all the drugname words and capitalize the first letter of each word as appropriate.

The CTE results are unioned together using STRING_AGG to join the drugnames back together. Note that separator parameter for STRING_AGG cannot be an expression. Using a CASE expression results in this error:

Msg 8733, Level 16, State 1, Line 14 Separator parameter for STRING_AGG must be a string literal or variable.

For SQL Server, you would need to be on SQL 2017 or greater for the STRING_AGG function. (I added an id/identity column to the #drugnames temp table to assist with grouping.)

DROP TABLE IF EXISTS #exclusionlist;
DROP TABLE IF EXISTS #drugnames;

CREATE TABLE #exclusionlist (word VARCHAR(25))
CREATE TABLE #drugnames (id INT IDENTITY, drugname VARCHAR(50))

INSERT INTO #exclusionlist VALUES ('ER')
INSERT INTO #exclusionlist VALUES ('HCL')

INSERT INTO #drugnames VALUES ('DRUGNAME ER')
INSERT INTO #drugnames VALUES ('drugname hcl')
INSERT INTO #drugnames VALUES ('ONEDRUG/OTHERDRUG')
    
;WITH SomeStuff AS
(
    select  d.id, d.drugname,
        sp.value AS SplitValue, el.word AS ExclusionListSpaceWord, 
        COALESCE(el.word,
            UPPER(LEFT(sp.value, 1))   LOWER(RIGHT(sp.value, LEN(sp.value) - 1))
        ) AS CapitalizedWord
    from #drugnames d
    CROSS APPLY STRING_SPLIT(d.drugname, CASE WHEN d.drugname LIKE '%/%' THEN '/' ELSE ' ' END ) sp
    LEFT JOIN #exclusionlist el
        ON el.word = sp.value
)
SELECT ss.id, STRING_AGG(ss.CapitalizedWord, '/') AS ReconstructedDrugname 
FROM SomeStuff ss
WHERE ss.drugname LIKE '%/%'
GROUP BY ss.id
UNION
SELECT ss.id, STRING_AGG(ss.CapitalizedWord, ' ') AS ReconstructedDrugname 
FROM SomeStuff ss
WHERE ss.drugname LIKE '% %'
GROUP BY ss.id

Output:

id          ReconstructedDrugname
----------- ----------------------
1           Drugname ER
2           Drugname HCL
3           Onedrug/Otherdrug

CodePudding user response:

If the "/" is replaced to a " /", then the split can still happen on the space.
And the extra space can be removed afterwards.

SELECT  
  [Product Name]   = d.drugname
, [Product Name 2] = ca.drugname2
FROM #drugnames d
CROSS APPLY (
  SELECT REPLACE(LTRIM(x),' /','/') AS drugname2
  FROM
  (
   SELECT ' ' 
     CASE
     WHEN e.word IS NOT NULL 
      THEN e.word
     WHEN s.value LIKE '/%' 
      THEN STUFF(LOWER(s.value),1,2,UPPER(LEFT(s.value,2)))
     ELSE  STUFF(LOWER(s.value),1,1,UPPER(LEFT(s.value,1)))
     END
   FROM STRING_SPLIT(REPLACE(d.drugname,'/',' /'),' ') s
   LEFT JOIN #exclusionlist e
     ON e.word = s.value
   FOR XML PATH('')
  ) q(x)
) ca;
Product Name Product Name 2
DRUGNAME ER Drugname ER
drugname hcl Drugname HCL
ONEDRUG/OTHERDRUG Onedrug/Otherdrug

db<>fiddle enter image description here

Note - using string_split does not, according to the documentation, guarantee the ordering of the values. In practice, I've never seen this be the case and since you're already using the function I'm using here also. There's plenty of ways to split the string while retaining an ordering (using json for example) should it ever prove necessary.

If you are still on 2016 then the string_agg can just be replaced with the for xml implementation you're already using.

  • Related