If I know which characters I do not want to keep in my string I can easily remove them with REPLACE
:
SELECT REPLACE(
REPLACE(
REPLACE(
'String with characters like #§$ I do not want to keep',
'#', ''
), '§', ''
), '$', ''
) AS repl_string
--- String with characters like I do not want to keep
But, what if I want to remove every character which is not part of a positive list? With a regex
I would so something like s/[^a-zA-Z0-9 :.]//g
(assuming that I would like to allow only letters, numbers, a space
, a dot .
or a colon :
)
I am looking for a solution for Microsoft SQL Server 2016
CodePudding user response:
This answer uses a very similar solution to the answer I linked above, however, as the OP is using SQL Server 2016, they can't use STRING_AGG
. As such you'll need to use the "old" FOR XML PATH
(and STUFF
) method to reaggregate the string:
CREATE OR ALTER FUNCTION [dbo].[PatternCharacterReplace_XML] (@String varchar(8000), @Pattern varchar(100), @ReplacementCharacter varchar(1))
RETURNS table
AS RETURN
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP(LEN(@String))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4)
SELECT (SELECT CASE WHEN V.C LIKE @Pattern THEN @ReplacementCharacter ELSE V.C END
FROM Tally T
CROSS APPLY (VALUES(SUBSTRING(@String,T.I,1)))V(C)
ORDER BY T.I
FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(8000)') AS ReplacedString;
GO
SELECT *
FROM dbo.PatternCharacterReplace_XML('String with characters like #§$ I do not want to keep','[^A-Za-z0-9 .:]','');
Note that this particular version uses varchar
values. If your value is an nvarchar
I would suggest creating a separate version that uses that data type for the parameters.