Home > Software design >  SQL Server - Regex pattern match only alphanumeric characters
SQL Server - Regex pattern match only alphanumeric characters

Time:05-06

I have an nvarchar(50) column myCol with values like these 16-digit, alphanumeric values, starting with '0': 0b00d60b8d6cfb19, 0b00d60b8d6cfb05, 0b00d60b8d57a2b9

I am trying to delete rows with myCol values that don't match those 3 criteria.

By following this article, I was able to select the records starting with '0'. However, despite the [a-z0-9] part of the regex, it also keeps selecting myCol values containing special characters like 00-d@!b8-d6/f&#b. Below is my select query:

SELECT * from Table
WHERE myCol LIKE '[0][a-z0-9]% AND LEN(myCol) = 16'

How should the expression be changed to select only rows with myCol values that don't contain special characters?

CodePudding user response:

If the value must only contain a-z and digits, and must start with a 0 you could use the following:

SELECT *
FROM (VALUES(N'0b00d60b8d6cfb19'),
            (N'0b00d60b8d6cfb05'),
            (N'0b00d60b8d57a2b9'),
            (N'00-d@!b8-d6/f&#b'))V(myCol)
WHERE V.myCol LIKE '0%' --Checks starts with a 0
  AND V.myCol NOT LIKE '%[^0-9A-z]%' --Checks only contains alphanumerical characters
  AND LEN(V.myCol) = 16;

The second clause works as the LIKE will match any character that isn't an alphanumerical character. The NOT then (obviously) reverses that, meaning that the expression only resolves to TRUE when the value only contains alphanumerical characters.

CodePudding user response:

Pattern matching in SQL Server is not awesome, and there is currently no real regex support.

The % in your pattern is what is including the special characters you show in your example. The [a-z0-9] is only matching a single character. If your character lengths are 16 and you're only interested in letters and numbers then you can include a pattern for each one:

SELECT *
FROM Table
WHERE myCol LIKE '[0][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]';

Note: you don't need the AND LEN(myCol) = 16 with this.

  • Related