Home > Software design >  Find text in between 2 strings
Find text in between 2 strings

Time:04-30

I have the following code which works for finding text in between 2 strings. I need help with finding text between multiple occurrences. In this case, I am trying to find text between start and end. In this case, it returns one, two, three properly.

 DECLARE @text AS VARCHAR(MAX) = 'this is for first <start> one, two, three <end>.';
 DECLARE @pretext AS nvarchar(100) = '<start>'    
 DECLARE @posttext AS nvarchar(100) = '<end>'

 SELECT 
  CASE 
     WHEN CHARINDEX(@posttext, @Text) - (CHARINDEX(@pretext, @Text)   len(@pretext)) < 0 
        THEN ''
        ELSE SUBSTRING(@Text, 
                       CHARINDEX(@pretext, @Text)   LEN(@pretext),   
                       CHARINDEX(@posttext, @Text) - (CHARINDEX(@pretext, @Text)   
 LEN(@pretext)))    
 END AS textinbetween   

Next, if I have a string such as the following:

  DECLARE @text AS VARCHAR(MAX) = 'this <end><end>is for first <start> one, two, three <end>.this is for second <start> four, five, six<end>.';

Essentially, I need all the text in between that are between and In this case there are 2: one, two, three and four, five, six. I can have any x amount of so just need to get that text between all of them.

Thank you in advance

CodePudding user response:

This is ugly stuff to do in T-SQL because of the "false" starts and ends. A CTE perhaps. I updated the test string to include a false start. If the false start and end tokens are to be excluded in the results, they can be replaced after the final substring.

DECLARE @text AS VARCHAR(MAX) = 'this <end><end>is for first <start> one, two, three <end>.this is for second <start> four, five, <start>six<start><end>.';
DECLARE @pretext as VARCHAR(128) = '<start>'
DECLARE @posttext as VARCHAR(128) = '<end>';

WITH Tokens as (
    SELECT CHARINDEX(@pretext, @Text, 1) as [startIndex], 
        CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, 1)) as [stopIndex],
        1 as [TokenNumber]
    WHERE CHARINDEX(@pretext, @Text, 1) > 0 
        and CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, 1)) > 0
    UNION ALL 
    SELECT CHARINDEX(@pretext, @Text, t.[stopIndex]) as [startIndex], 
        CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, CHARINDEX(@pretext, @Text, t.[stopIndex]))) as [stopIndex],
        t.TokenNumber   1
    FROM Tokens t
    WHERE CHARINDEX(@pretext, @Text, t.[stopIndex]) > 0 
        AND CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, CHARINDEX(@pretext, @Text, t.[stopIndex]))) > 0
) SELECT *, 
    SUBSTRING(@Text, [startIndex]   LEN(@pretext), [stopIndex] - [startIndex] - LEN(@pretext)) 
    FROM Tokens

CodePudding user response:

Assuming it is ok to have each value in its own row string_split is a simple option. If you need this to be in a single row there are thousands of examples of how to pivot or use stuff to get these back to a single row.

declare @asdf varchar(100) = 'This is a test <one>first string<two> and this is the second <one>second string<two>.'

select left(x.value, charindex('<two>', x.value) - 1)
from string_split(replace(@asdf, '<one>', '^'), '^') x --The delimiter is limited to a single character so just used replace to switch it for a character not in your string.
where charindex('<two>', x.value) > 0
  • Related