I have a stored procedure it takes comma separated string as input. Which might be too large some times approximately more than 8 thousand characters or more. In that situation, query performance goes down sometimes. And I think there is a limitation for the character length inside the IN
clause. For that, sometimes I get errors. Now, I need to know is it better to use a Custom TABLE TYPE as parameter and use Inner JOIN
to find the result. If it is then why is it. Here are my 2 stored procedures (minimal code):
CREATE TYPE [dbo].[INTList] AS TABLE(
[ID] [int] NULL
)
Procedure 1
CREATE PROCEDURE [report].[GetSKU]
@list [INTList] READONLY,
AS
Select sk.SKUID,sk.Code SCode,sk.SName
FROM SKUs sk
INNER JOIN @list sst ON sst.ID=sk.SKUID
Procedure 2
CREATE PROCEDURE [report].[GetSKU]
@params varchar(max),
AS
Select sk.SKUID,sk.Code SCode,sk.SName
FROM SKUs sk
WHere CHARINDEX(',' cast( sk.SKUID as varchar(MAX)) ',', @params) > 0
Now, which procedures is better to use.
Note: Original Stored Procedures does have few more Joins
.
CodePudding user response:
As this question did raise quite some discussion in comments but did not get any viable answer, I'd like to add the major points in order to help future research.
This question is about: How do I pass a (large) list of values into a query?
In most cases, people need this either in a WHERE SomeColumn IN(SomeValueList)
-filter or to JOIN
against this with something like FROM MyTable INNER JOIN SomeValueList ON...
.
Very important is the SQL-Server's version, as with v2016 we got two great tools: native STRING_SPLIT()
(not position-safe!) and JSON support.
Furthermore, and rather obvious, we have to think about the scales and values.
- Do we pass in a simple list of some IDs or a huge list with thousands of values?
- Do we talk about simple integers or GUIDs?
- And what's about text values, where we have to think about dangerous characters (like
[ { "
in JSON or< &
in XML - there are many more...)? - What about CSV-lists, where the separating character might appear within the content (quoting / escaping)?
- In some cases we might even want to pass several columns at once...
There are several options:
- Table valued parameter (TVP,
CREATE TYPE ...
), - CSV together with string splitting functions (native since v2016, various home brewed, CLR...),
- and text-based containers: XML or JSON (since v2016)
Table valued paramter (TVP - the best choice)
A table valued parameter (TVP) must be created in advance (this might be a draw back) but will behave as any other table once created. You can add indexes, you can use it in various use cases and you do not have to bother about anything under the hood.
Sometimes we cannot use this due to missing rights to use CREATE TYPE
...
Character separated values (CSV)
With CSV we see three approaches
Dynamic Sql: Create a statement, where the CSV list is simply stuffed into the
IN()
and execute this dynamically. This can be a very efficient approach, but will be open to various obstacles (no ad-hoc-usage, injection threat, breaking on bad values...)String splitting functions: There are tons of examples around... All of them have in common that the separated string will be returned as a list of items. Common issues here: performance, missing ordinal position, limits for the separator, handling of duplicate or empty values, handling of quoted or escaped values, handling of separators within the content. Aaron Bertrand did some great research about the various approaches of string splitting. Similar to TVPs one draw back might be, that this function must exist in the database in advance or that we need to be allowed to execute
CREATE FUNCTION
if not.ad-hoc-splitters: Before v2016 the most used approach was XML based, since then we have moved to JSON based splitters. Both use some string methods to transform the CSV string to 1) separated elements (XML) or 2) into a JSON-array. The result is queried by 1) XQuery (
.value()
and.nodes()
) or 2) JSON'sOPENJSON()
orJSON_VALUE()
.
Text based containers
We can pass the list as string, but within a defined format:
- Using
["a","b","c"]
instead ofa,b,c
allows for immediate usage ofOPENJSON()
. - Using
<x>a</x><x>b</x><x>c</x>
instead allows for XML queries.
The biggest advantage here: Any programming language provides support for these formats.
Common obstacles like date and number formatting is solved implicitly. Passing JSON or XML is - in most cases - just some few lines of code.
Both approaches allow for type- and position-safe queries.
We can solve our needs without the need to rely on anything existing in advance.
CodePudding user response:
For the very best performance you can use this function:
CREATE FUNCTION [dbo].StringSplit
(
@String VARCHAR(MAX), @Separator CHAR(1)
)
RETURNS @RESULT TABLE(Value VARCHAR(MAX))
AS
BEGIN
DECLARE @SeparatorPosition INT = CHARINDEX(@Separator, @String ),
@Value VARCHAR(MAX), @StartPosition INT = 1
IF @SeparatorPosition = 0
BEGIN
INSERT INTO @RESULT VALUES(@String)
RETURN
END
SET @String = @String @Separator
WHILE @SeparatorPosition > 0
BEGIN
SET @Value = SUBSTRING(@String , @StartPosition, @SeparatorPosition- @StartPosition)
IF( @Value <> '' )
INSERT INTO @RESULT VALUES(@Value)
SET @StartPosition = @SeparatorPosition 1
SET @SeparatorPosition = CHARINDEX(@Separator, @String , @StartPosition)
END
RETURN
END
This function return table - select * from StringSplit('12,13,14,15,16', ',')
so you can join this function to your table or can use IN
on the where clause.