I have a list of drug names that are stored in various upper and lower case combinations. I need to capitalize the first letter of each word in a string, excluding certain words. The string is separated by spaces, but can also be separated by a forward slash.
The following code works:
create table #exclusionlist (word varchar(25))
create table #drugnames (drugname varchar(50))
insert into #exclusionlist values ('ER')
insert into #exclusionlist values ('HCL')
insert into #drugnames values ('DRUGNAME ER')
insert into #drugnames values ('drugname hcl')
insert into #drugnames values ('ONEDRUG/OTHERDRUG')
select 'Product Name' = drugname
, 'Product Name 2' = STUFF((SELECT ' '
case when value in (select word from #exclusionlist) then upper(value)
else upper(left(value, 1)) lower(substring(value, 2, len(value))) end
from STRING_SPLIT(drugname, ' ')
FOR XML PATH('')) ,1,1,'')
from #drugnames
The output looks like this:
Drugname ER
Drugname HCL
Onedrug/otherdrug
How can I get that last one to look like this:
Onedrug/Otherdrug
I did try STRING_SPLIT(replace(drugname, '/', ' '), ' ')
but obviously replaces the slash with a space. And if the slash is at the end of the string like ONEDRUG/OTHERDRUG/
then the result looks like Onedrug Otherdrug 
It's possible that the string may end in a forward slash due to the field only holding N number of characters. When data gets inserted into the table, only the first N characters of the drug name are inserted. If that Nth character is a slash, the string will end in a slash.
CodePudding user response:
You can use a CASE
expression for the separator
parameter to the STRING_SPLIT
function.
In the code below, a common table expression (CTE) uses STRING_SPLIT
to split out all the drugname words and capitalize the first letter of each word as appropriate.
The CTE results are unioned together using STRING_AGG
to join the drugnames back together. Note that separator parameter for STRING_AGG
cannot be an expression. Using a CASE
expression results in this error:
Msg 8733, Level 16, State 1, Line 14 Separator parameter for STRING_AGG must be a string literal or variable.
For SQL Server, you would need to be on SQL 2017 or greater for the STRING_AGG
function. (I added an id/identity column to the #drugnames temp table to assist with grouping.)
DROP TABLE IF EXISTS #exclusionlist;
DROP TABLE IF EXISTS #drugnames;
CREATE TABLE #exclusionlist (word VARCHAR(25))
CREATE TABLE #drugnames (id INT IDENTITY, drugname VARCHAR(50))
INSERT INTO #exclusionlist VALUES ('ER')
INSERT INTO #exclusionlist VALUES ('HCL')
INSERT INTO #drugnames VALUES ('DRUGNAME ER')
INSERT INTO #drugnames VALUES ('drugname hcl')
INSERT INTO #drugnames VALUES ('ONEDRUG/OTHERDRUG')
;WITH SomeStuff AS
(
select d.id, d.drugname,
sp.value AS SplitValue, el.word AS ExclusionListSpaceWord,
COALESCE(el.word,
UPPER(LEFT(sp.value, 1)) LOWER(RIGHT(sp.value, LEN(sp.value) - 1))
) AS CapitalizedWord
from #drugnames d
CROSS APPLY STRING_SPLIT(d.drugname, CASE WHEN d.drugname LIKE '%/%' THEN '/' ELSE ' ' END ) sp
LEFT JOIN #exclusionlist el
ON el.word = sp.value
)
SELECT ss.id, STRING_AGG(ss.CapitalizedWord, '/') AS ReconstructedDrugname
FROM SomeStuff ss
WHERE ss.drugname LIKE '%/%'
GROUP BY ss.id
UNION
SELECT ss.id, STRING_AGG(ss.CapitalizedWord, ' ') AS ReconstructedDrugname
FROM SomeStuff ss
WHERE ss.drugname LIKE '% %'
GROUP BY ss.id
Output:
id ReconstructedDrugname
----------- ----------------------
1 Drugname ER
2 Drugname HCL
3 Onedrug/Otherdrug
CodePudding user response:
If the "/" is replaced to a " /", then the split can still happen on the space.
And the extra space can be removed afterwards.
SELECT [Product Name] = d.drugname , [Product Name 2] = ca.drugname2 FROM #drugnames d CROSS APPLY ( SELECT REPLACE(LTRIM(x),' /','/') AS drugname2 FROM ( SELECT ' ' CASE WHEN e.word IS NOT NULL THEN e.word WHEN s.value LIKE '/%' THEN STUFF(LOWER(s.value),1,2,UPPER(LEFT(s.value,2))) ELSE STUFF(LOWER(s.value),1,1,UPPER(LEFT(s.value,1))) END FROM STRING_SPLIT(REPLACE(d.drugname,'/',' /'),' ') s LEFT JOIN #exclusionlist e ON e.word = s.value FOR XML PATH('') ) q(x) ) ca;
Product Name Product Name 2 DRUGNAME ER Drugname ER drugname hcl Drugname HCL ONEDRUG/OTHERDRUG Onedrug/Otherdrug
Note - using string_split does not, according to the documentation, guarantee the ordering of the values. In practice, I've never seen this be the case and since you're already using the function I'm using here also. There's plenty of ways to split the string while retaining an ordering (using json for example) should it ever prove necessary.
If you are still on 2016 then the string_agg can just be replaced with the for xml implementation you're already using.