I have a nvarchar column in table that stores data as '{CRLF}' separated values. Those values can be in the format 'namespace#name$value' but also in other formats.
I need to extract separated namespace, name and value and skip data in other formats. I can do it with a query:
select
SUBSTRING([value], 1, CHARINDEX('#',[value]) - 1) as paramNamespace,
SUBSTRING([value], CHARINDEX('#',[value]) 1, CHARINDEX('$',[value]) - CHARINDEX('#',[value]) - 1) as paramName,
SUBSTRING([value], CHARINDEX('$',[value]) 1, LEN([value])) as paramValue
FROM STRING_SPLIT(REPLACE((select data from test where Id = 1), '{CRLF}', CHAR(7)), CHAR(7))
WHERE [value] LIKE '%#%$%'
However, when I try also to filter in by paramName with this query:
select * from (select
SUBSTRING([value], 1, CHARINDEX('#',[value]) - 1) as paramNamespace,
SUBSTRING([value], CHARINDEX('#',[value]) 1, CHARINDEX('$',[value]) - CHARINDEX('#',[value]) - 1) as paramName,
SUBSTRING([value], CHARINDEX('$',[value]) 1, LEN([value])) as paramValue
FROM STRING_SPLIT(REPLACE((select data from test where Id = 1), '{CRLF}', CHAR(7)), CHAR(7))
WHERE [value] LIKE '%#%$%') a
where paramName IN ('name', 'name3')
it returns
Invalid length parameter passed to the LEFT or SUBSTRING function.
It looks like "WHERE [value] LIKE '%#%$%'" weren't executed first and it tries to substring value in other format.
How should I write this query to filter it by paramName?
Schema:
CREATE TABLE test (Id INT, data nvarchar(max));
INSERT INTO test VALUES (1, N'namespace#name$value{CRLF}somedatainotherformat{CRLF}namespace2#name2$value2{CRLF}namespace3#name3$value3{CRLF}');
Fiddle: http://sqlfiddle.com/#!18/e5d71/1
CodePudding user response:
Perhaps PARSENAME
to split the data into columns is what you are after:
CREATE TABLE test (Id int,
data nvarchar(MAX));
INSERT INTO test
VALUES (1, N'namespace#name$value{CRLF}somedatainotherformat{CRLF}namespace2#name2$value2{CRLF}namespace3#name3$value3{CRLF}');
GO
SELECT *
FROM dbo.test t
CROSS APPLY STRING_SPLIT(REPLACE(t.data,'{CRLF}','|'),'|') SS
CROSS APPLY (VALUES(PARSENAME(TRANSLATE(SS.[value],'$#','..'),3),PARSENAME(TRANSLATE(SS.[value],'$#','..'),2),PARSENAME(TRANSLATE(SS.[value],'$#','..'),1)))V(paramNamespace, paramName, paramValue)
WHERE V.paramName IN ('name', 'name3')
GO
DROP TABLE test;
CodePudding user response:
Here is one option:
Select paramNamespace = substring(v.data, 1, p1.pos - 2)
, paramName = substring(v.data, p1.pos, p2.pos - p1.pos - 1)
, paramValue = substring(v.data, p2.pos, p3.pos - p2.pos - 1)
From test As t
Cross Apply string_split(replace(t.data, '{CRLF}', '|'), '|') As s
Cross Apply (Values (concat(s.value, '#$'))) As v(data)
Cross Apply (Values (charindex('#', v.data, 1) 1)) As p1(pos)
Cross Apply (Values (charindex('$', v.data, p1.pos) 1)) As p2(pos)
Cross Apply (Values (charindex('#', v.data, p2.pos) 1)) As p3(pos) --end of set
Where p2.pos - p1.pos > 1;
We concat the delimiters to the end - and then filter out any rows where the difference between the delimiters is greater than 1. So any rows that have only '#$' or rows that start with '#$' or rows that have some other value that does not include either of the delimiters.
If you have data that can have either delimiter - but not the other, you need to filter those out some other way.