I am trying to figure out how to extract a substring from a column in SQL Server. I very much would like to be able to do it directly in SQL if at all possible. The content of my column holds the responses of a web form and looks like this:
"a:27:{s:5:\"FieldX\";s:22:\"Response to FieldX\";s:16:\"WhatProductDoYouWant\";s:31:\"SomeProduct\";s:16:\"FieldY\";s:4:\"Response to FieldY\"}
In the previous example the form has three fields with their respective responses as:
FieldName Response
FieldX Response to FieldX
WhatProductDoYouWant SomeProduct
FieldY Response to FieldY
I need to extract the answer to WhatProductDoYouWant
, that is, I need SomeProduct
.
Constraints:
- I do not know how many fields there are before or after the field I am looking for, it is a dynamic form.
- The answer to the field is dynamic, meaning I do not know how many characters I need to account for, it could be anything.
For a full example, let's say I have the following table in SQL Server table:
CREATE TABLE WebFormData
(
FormID int,
Responses varchar(MAX)
);
INSERT INTO WebFormData (FormID, Responses)
VALUES (1, 'a:27:{s:5:\"FieldX\";s:22:\"Response to FieldX\";s:16:\"WhatProductDoYouWant\";s:31:\"SomeProduct\";s:16:\"FieldY\";s:4:\"Response to FieldY\"}');
INSERT INTO WebFormData (FormID, Responses)
VALUES (2, 'a:27:{s:5:\"FieldX\";s:22:\"Response to FieldX\";a:27:{s:7:\"FieldX2\";s:27:\"Response to FieldX2\";s:16:\"WhatProductDoYouWant\";s:31:\"SomeOtherProduct\";s:16:\"FieldZ\";s:4:\"Response to FieldZ\";s:16:\"FieldY\";s:4:\"Response to FieldY\"}');
I would like to have a SQL query like:
SELECT FormID, someExpression AS Products
FROM WebFormData
And I would expect to have as results:
1, SomeProduct
2, SomeOtherProduct
I have been able to identify the index of the initial character I am looking for but I have no idea how to determine the length of the substring:
SELECT
FormID,
SUBSTRING(Responses, CHARINDEX('WhatProductDoYouWant', Responses) 30, 20) AS Products
FROM
WebFormData
(The 20 in the length parameter of the substring function is just a random number for demonstration purposes)
The query returns:
FormID,Products
1, SomeProduct\";s:16:\
2, SomeOtherProduct\";s
Any help would be appreciated. Please let me know if clarification is required.
CodePudding user response:
You can simply chain cross apply with a values clause to find your start and length positions, and use nullif to handle where the text does not exist:
select Substring(Responses, NullIf(p1,0) 30, p2-(NullIf(p1,0) 30)) Products
from WebFormData
cross apply (values(CHARINDEX('WhatProductDoYouWant', Responses)))x(p1)
cross apply (values(CHARINDEX('\"', Responses, p1 30 )))y(p2);
CodePudding user response:
Perhaps more than you are looking for.
If you want to parse the entire string
Example
;with cte as (
select FormID
,B.RetSeq
,RetVal = replace(B.RetVal,'\"','')
,grp = sum(retSeq % 2) over (partition by FormID order by RetSEQ)
,col = retSeq % 2
From WebFormData A
Cross Apply [dbo].[tvf-Str-Extract-JSON](replace(Responses,'}',';'),':',';') B
)
Select FormID
,Seq = Grp
,Question = max(case when col=1 then RetVal end)
,Response = max(case when col=0 then RetVal end)
From cte
Group By FormID,Grp
Order By FormID,Grp
Results
The Helper Function if Interested
CREATE FUNCTION [dbo].[tvf-Str-Extract-JSON] (@String nvarchar(max),@Delim1 nvarchar(100),@Delim2 nvarchar(100))
Returns Table
As
Return (
Select RetSeq = row_number() over (order by RetSeq)
,RetVal = left(RetVal,charindex(@Delim2,RetVal)-1)
From (
Select RetSeq = [Key] 1
,RetVal = Value
From OpenJSON( N'["' replace(string_escape(@String,'json'),@Delim1,'","') N'"]' )
) C1
Where charindex(@Delim2,RetVal)>1
)