Home > Software engineering >  T-SQL - How to pattern match for a list of values?
T-SQL - How to pattern match for a list of values?

Time:07-20

I'm trying to find the most efficient way to do some pattern validation in T-SQL and struggling with how to check against a list of values. This example works:

SELECT *
FROM SomeTable
WHERE Code LIKE '[0-9]JAN[0-9][0-9]'
OR Code LIKE '[0-9]FEB[0-9][0-9]'
OR Code LIKE '[0-9]MAR[0-9][0-9]'
OR Code LIKE '[0-9]APRIL[0-9][0-9]

but I am stuck on wondering if there is a syntax that will support a list of possible values within the single like statement, something like this (which does not work)

SELECT *
FROM SomeTable
WHERE Code LIKE '[0-9][JAN, FEB, MAR, APRIL][0-9][0-9]'

I know I can leverage charindex, patindex, etc., just wondering if there is a simpler supported syntax for a list of possible values or some way to nest an IN statement within the LIKE. thanks!

CodePudding user response:

I think the closest you'll be able to get is with a table value constructor, like this:

SELECT *
FROM SomeTable st
INNER JOIN (VALUES 
    ('[0-9]JAN[0-9][0-9]'),
    ('[0-9]FEB[0-9][0-9]'),
    ('[0-9]MAR[0-9][0-9]'),
    ('[0-9]APRIL[0-9][0-9]')) As p(Pattern) ON st.Code LIKE p.Pattern

This is still less typing and slightly more efficient than the OR option, if not as brief as we hoped for. If you knew the month was always three characters we could do a little better:

Code LIKE '[0-9]___[0-9][0-9]'

Unfortunately, I'm not aware of SQL Server pattern character for "0 or 1" characters. But maybe if you want ALL months we can use this much to reduce our match:

SELECT *
FROM SomeTable
WHERE (Code LIKE '[0-9]___[0-9][0-9]'
    OR Code LIKE '[0-9]____[0-9][0-9]'
    OR Code LIKE '[0-9]_____[0-9][0-9]')

You'll want to test this to check if the data might contain false positive matches, and of course the table-value constructor could use this strategy, too. Also, I really hope you're not storing dates in a varchar column, which is a broken schema design.

One final option you might have is building the pattern on the fly. Something like this:

Code LIKE '[0-9]'   'JAN'   '[0-9][0-9]'

But how you find that middle portion is up to you.

CodePudding user response:

The native TSQL string functions don't support anything like that.

But you can use a workaround (dbfiddle) such as

WHERE  CASE WHEN Code LIKE '[0-9]%[^ ][0-9][0-9]' THEN SUBSTRING(Code, 2, LEN(Code) - 3) END 
       IN 
       ( 'JAN', 'FEB', 'MAR', 'APRIL' ) 

So first of all check that the string starts with a digit and ends in a non-space character followed by two digits and then check the remainder of the string (not matched by the digit check) is one of the values you want.

The reason for including the SUBSTRING inside the CASE is so that is only evaluated on strings that pass the LIKE check to avoid possible "Invalid length parameter passed to the LEFT or SUBSTRING function." errors if it was to be evaluated on a shorter string.

  • Related