Home > Software engineering >  Separate text by delimiter SQL Server
Separate text by delimiter SQL Server

Time:02-22

I need some help to separate the text by the delimiter /, and get the 3rd argument, I know that with the functions CHARINDEX, SUBSTRING, CASE, THEN, WHEN... it's possible, but I'm having difficulty.

Database details:

AE/SREW/SWERR/REEEM/RRTS
AI/SREW/RUYE/REMC/DOR
AR/SRTE/RIRRH/PROTRED
AE/ASDEE/RIRRH/PROTRED/SEDRR
AE/FGY/RH/VETSDEE/AIURD
TA/PA/RHNEW/SERJJ/CENTER/FERT
FJHDE
FJHDE/YREY

I ran the following query, but got the following result:

SELECT
    SUBSTRING(values, CHARINDEX('/', values, (CHARINDEX('/', values)   1))   1, LEN(values)) AS Result
FROM
    dbtable

Result:

SWERR/REEEM/RRTS
RUYE/REMC/DOR
RIRRH/PROTRED
RIRRH/PROTRED/SEDRR
HR/VETSDEE/AIURD
RHNEW/SERJJ/CENTER/FERT
FJHDE
FJHDE/YREY

Desired result:

ANSWER
RUYE
RIRRH
RIRRH
HR
RHNEW
FJHDE -> For cases where there is no "/", put the text 'OTHER'.
FJHDE/YREY -> For cases where there is "/" only one (1), "/", put the text 'OTHER'.

Best regards.

CodePudding user response:

You can transform those strings into the format of a JSON array.
Then pick the 2th key from that. (the key starts at 0)

SELECT ISNULL(a.value, 'OTHER') AS Result
FROM dbtable t
OUTER APPLY (
  SELECT j.value
  FROM OPENJSON('["' REPLACE(t.[values],'/', '","') '"]') j
  WHERE j.[key] = 2
) a;
Result
SWERR
RUYE
RIRRH
RIRRH
RH
RHNEW
OTHER
OTHER

Test on db<>fiddle here

CodePudding user response:

You're passing the wrong length to the SUBSTRING function, which is why you're getting the result from 2nd / up until the end. I broke it down with CROSS APPLY for 2nd and 3rd /:

SELECT vals,
    split2.idx, split3.idx,
    CASE 
      WHEN split2.idx = 0 THEN 'OTHER'
      ELSE SUBSTRING(vals, split2.idx 1, split3.idx - split2.idx - 1)
    END as third
FROM
    tb
CROSS 
APPLY (SELECT CHARINDEX('/', vals, CHARINDEX('/', vals) 1) as idx) split2
CROSS
APPLY (SELECT CHARINDEX('/', vals, split2.idx 1) as idx) split3

Alternatively, if you're on Azure SQL Database or using SQL Server 2022, you can simply use STRING_SPLIT as described in the documentation, with value and ordinal returned from the function.

If you're on SQL Server 2016 or 2019, with STRING_SPLIT but without the ordinal, you need sort of a hack using ROW_NUMBER() OVER (ORDER BY (SELECT 0)) as the function doesn't really tell you which string is at which index:

SELECT tb.vals, ISNULL(MAX(IIF(ordinal = 3, value, NULL)), 'OTHER')
  FROM tb
 CROSS
 APPLY (SELECT value
             , ROW_NUMBER() OVER (ORDER BY (SELECT 0)) as ordinal 
          FROM STRING_SPLIT(tb.vals, '/')
     ) split
 GROUP BY tb.vals

The way this works is uses the STRING_SPLIT function to split each of the values into rows containing the individual substrings. ROW_NUMBER() ordered by a literal is used to then give the index, although no actual guarantee is ever made by Microsoft that the substrings will be returned from the function in order of occurrence, that's why it's a hack. Then, all of these rows are aggregated and, in the MAX aggregation function we select the 3rd occurrence. If no such occurence is found, the IIF statement will return all NULLs and MAX in turn will also be NULL, which we then substitute for 'OTHER', as per requirements.

Working demo on dbfiddle for both solutions

  • Related