I have the following code which works for finding text in between 2 strings. I need help with finding text between multiple occurrences. In this case, I am trying to find text between start and end. In this case, it returns one, two, three properly.
DECLARE @text AS VARCHAR(MAX) = 'this is for first <start> one, two, three <end>.';
DECLARE @pretext AS nvarchar(100) = '<start>'
DECLARE @posttext AS nvarchar(100) = '<end>'
SELECT
CASE
WHEN CHARINDEX(@posttext, @Text) - (CHARINDEX(@pretext, @Text) len(@pretext)) < 0
THEN ''
ELSE SUBSTRING(@Text,
CHARINDEX(@pretext, @Text) LEN(@pretext),
CHARINDEX(@posttext, @Text) - (CHARINDEX(@pretext, @Text)
LEN(@pretext)))
END AS textinbetween
Next, if I have a string such as the following:
DECLARE @text AS VARCHAR(MAX) = 'this <end><end>is for first <start> one, two, three <end>.this is for second <start> four, five, six<end>.';
Essentially, I need all the text in between that are between and In this case there are 2: one, two, three and four, five, six. I can have any x amount of so just need to get that text between all of them.
Thank you in advance
CodePudding user response:
This is ugly stuff to do in T-SQL because of the "false" starts and ends. A CTE perhaps. I updated the test string to include a false start. If the false start and end tokens are to be excluded in the results, they can be replaced after the final substring.
DECLARE @text AS VARCHAR(MAX) = 'this <end><end>is for first <start> one, two, three <end>.this is for second <start> four, five, <start>six<start><end>.';
DECLARE @pretext as VARCHAR(128) = '<start>'
DECLARE @posttext as VARCHAR(128) = '<end>';
WITH Tokens as (
SELECT CHARINDEX(@pretext, @Text, 1) as [startIndex],
CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, 1)) as [stopIndex],
1 as [TokenNumber]
WHERE CHARINDEX(@pretext, @Text, 1) > 0
and CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, 1)) > 0
UNION ALL
SELECT CHARINDEX(@pretext, @Text, t.[stopIndex]) as [startIndex],
CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, CHARINDEX(@pretext, @Text, t.[stopIndex]))) as [stopIndex],
t.TokenNumber 1
FROM Tokens t
WHERE CHARINDEX(@pretext, @Text, t.[stopIndex]) > 0
AND CHARINDEX(@posttext, @Text, CHARINDEX(@pretext, @Text, CHARINDEX(@pretext, @Text, t.[stopIndex]))) > 0
) SELECT *,
SUBSTRING(@Text, [startIndex] LEN(@pretext), [stopIndex] - [startIndex] - LEN(@pretext))
FROM Tokens
CodePudding user response:
Assuming it is ok to have each value in its own row string_split is a simple option. If you need this to be in a single row there are thousands of examples of how to pivot or use stuff to get these back to a single row.
declare @asdf varchar(100) = 'This is a test <one>first string<two> and this is the second <one>second string<two>.'
select left(x.value, charindex('<two>', x.value) - 1)
from string_split(replace(@asdf, '<one>', '^'), '^') x --The delimiter is limited to a single character so just used replace to switch it for a character not in your string.
where charindex('<two>', x.value) > 0