I am using the following SQL to create a function to strip out characters or numbers.
CREATE FUNCTION [dbo].[fn_StripCharacters]
(
@String NVARCHAR(MAX),
@MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
SET @MatchExpression = '%[' @MatchExpression ']%'
WHILE PatIndex(@MatchExpression, @String) > 0
SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
RETURN @String
END
I am calling it like this to extract the alpha characters and then the numeric characters into two sort fields:
SELECT
...
(SELECT dbo.fn_StripCharacters(PD.District, '^a-z')) AS Sort1,
CAST((SELECT dbo.fn_StripCharacters(PD.District, '^0-9')) AS INT) AS Sort2,
...
FROM
I am searching a LOT of Postcode District Records and the constant calls to the function are causing a comparatively big delay. Is there a way of reproducing this functionality without a function call? Can it be incorporated into the SELECT statement somehow?
SQL Server Version is 2017
CodePudding user response:
You can try something like this:
Create Function [dbo].[fnCleanString] (
@inputString varchar(8000)
, @stringPattern varchar(50) = '[0-9a-zA-Z]'
)
Returns table
With schemabinding
As
Return
With t(n)
As (
Select t.n
From (
Values (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)
, (0), (0), (0), (0), (0), (0), (0), (0), (0), (0)) As t(n)
)
, iTally (n)
As (
Select Top (len(@inputString))
checksum(row_number() over(Order By @@spid))
From t t1, t t2, t t3 -- 8000 rows
)
Select v.inputString
, outputString = (Select substring(v.inputString, it.n, 1)
From iTally it
Where substring(v.inputString, it.n, 1) Like @stringPattern
For xml Path(''), Type).value('.', 'varchar(8000)')
From (Values (@inputString)) As v(inputString);
GO
And call it like this:
Declare @testData table (AlphaNumeric varchar(100));
Insert Into @testData (AlphaNumeric)
Values ('a1b2c3d4'), ('5e6f7g8i');
Select *
From @testData td
Cross Apply dbo.fnCleanString(td.AlphaNumeric, '[^A-Z]') cs
Cross Apply dbo.fnCleanString(td.AlphaNumeric, '[^0-9]') cs2;
This works with VARCHAR - but you can easily change it to return NVARCHAR if that is what you really need. Just be aware that it is written to work with a fixed length string and the maximum size for NVARCHAR is 4000.
You also need to be aware of collation and how that will affect the results.
CodePudding user response:
How to extract alpha or numbers from a string within a SELECT statement WITHOUT a function call?
You can't, because simple string operations like REPLACE()
are also function calls. And, user-defined functions are compiled by SQL Server. They perform pretty well.
But your problem is not the function's performance itself, but rather how often you must use it. You knew that.
Here's a possible way to speed up your postcode-grinding task: put persisted computed columns on your table. You can even index them.
Here's how to do that.
Tell SQL Server to use schemabinding with your stored function. It needs to know that a table definition in your schema depends on the function. To do that add
WITH SCHEMABINDING
to your function definition.... RETURNS NVARCHAR(MAX) WITH SCHEMABINDING AS BEGIN ...
Add two computed, persisted, columns to your table.
... ALTER TABLE postcode ADD letters AS (dbo.fn_StripCharacters(postcode, '^A-Z')) PERSISTED; ALTER TABLE dbo.postcode ADD numbers AS (CAST(dbo.fn_StripCharacters(postcode, '^0-9') AS INT)) PERSISTED;
You can put indexes on the computed columns if you need them.
CREATE INDEX numbers ON postcode (numbers DESC)
Now you can insert, update, or delete your non-computed columns as you wish. SQL Server evaluates your stored functions just once for each row as it is inserted or updated. Your functions still get evaluated, but not when you SELECT from your table.
And you can use the computed columns as you wish
SELECT * FROM postcode ORDER BY numbers DESC
Here's a db<>fiddle demonstrating this.