Regex IsMatch all Letters except 'X', for example

Time:01-04

How can I do something like this with Regex?

string myString = "smth there";
foreach (char c in myString)
{
    if (char.IsLetter(c) && !c.Equals('X')) return false;
}

I have tried this:

if (Regex.IsMatch(number, @"[A-Za-z^X]")) return false;

I assumed that, because [^X] means "everything except X", putting ^X inside the class would exclude X. It didn't work.

CodePudding user response:

You can't negate only a portion of a character class using a caret. You must either negate the entire class (by having ^ as the first character in the class definition), or none of it. You could use the regex [A-WYZ] with the IgnoreCase option. That character class matches only the characters A-W, and Y and Z.
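To make the point about the caret concrete, here is a small sketch (the class and method names are made up for illustration). Inside a character class, a caret that is not the first character is just a literal '^', so the pattern from the question still matches X. The second method uses the explicit ranges suggested above.

```csharp
using System.Text.RegularExpressions;

public static class CaretDemo
{
    // Pattern from the question: '^' is not first in the class,
    // so it is a literal caret and 'X' is still included.
    public static bool QuestionPatternMatches(string input) =>
        Regex.IsMatch(input, @"[A-Za-z^X]");

    // Explicit ranges that skip X, matched case-insensitively.
    // Unanchored: true if any letter other than X/x appears.
    public static bool HasLetterOtherThanX(string input) =>
        Regex.IsMatch(input, "[A-WYZ]", RegexOptions.IgnoreCase);
}
```

Here CaretDemo.QuestionPatternMatches("X") and CaretDemo.QuestionPatternMatches("^") both succeed, which is why the original attempt rejected everything, while CaretDemo.HasLetterOtherThanX("XXX") finds nothing to object to.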

Or, you could use character class subtraction. The regex [A-Z-[X]] will match all characters from A to Z, except X.

To match the full string instead of looping over characters, you could use ^[A-Z-[X]]+$. Outside of [], a caret (^) matches the start of a string, and a $ matches the end of a string (or of a line if the Multiline option is specified).

Note that this won't match your string myString = "smth there";, because you've excluded spaces.

Regex.IsMatch("smththere", @"^[A-Z-[X]]+$", RegexOptions.IgnoreCase); // true

Regex.IsMatch("smth there", @"^[A-Z-[X]]+$", RegexOptions.IgnoreCase); // false (because of the space)

Regex.IsMatch("xinthisstring", @"^[A-Z-[X]]+$", RegexOptions.IgnoreCase); // false (because of the x)
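For comparison, the loop in the question returns false as soon as it finds any letter other than uppercase X, which is a search rather than a whole-string match. A rough regex equivalent for ASCII letters, assuming that reading of the loop (the helper name is hypothetical), drops both the anchors and the IgnoreCase option:

```csharp
using System.Text.RegularExpressions;

public static class LoopEquivalent
{
    // True if the input contains an ASCII letter other than uppercase 'X'.
    // No anchors: a single offending character anywhere is enough.
    // No IgnoreCase: lowercase 'x' still counts as an offending letter,
    // as it does in the question's loop.
    public static bool ContainsLetterOtherThanUpperX(string input) =>
        Regex.IsMatch(input, @"[A-Za-z-[X]]");
}
```

With this helper, the check from the question would read: if (LoopEquivalent.ContainsLetterOtherThanUpperX(myString)) return false;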

Alexei makes a very good point below:

Character class is easy - @"^[\w-[X]]+$", it is A-Z that is tricky

You can exclude the character 'X' from your character class with -[X] in the class, but replicating the behavior of char.IsLetter is tricky, since char.IsLetter is true for all Unicode letters, not just ASCII letters. You could explicitly specify that your character class contains every Unicode letter that char.IsLetter is true for, but that would be tedious. It's much easier to just use char.IsLetter(c) and c != 'X' if you need to deal with all Unicode letters. See https://learn.microsoft.com/en-us/dotnet/api/system.char.isletter?view=net-7.0#remarks
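One possible shortcut that the answer above does not mention: .NET regexes support Unicode category escapes, and \p{L} covers the letter categories (Lu, Ll, Lt, Lm, Lo) that the linked remarks list for char.IsLetter. A sketch under that assumption, with a hypothetical helper name:

```csharp
using System.Text.RegularExpressions;

public static class UnicodeLetters
{
    // The whole string must consist of Unicode letters,
    // none of which is uppercase 'X'.
    // Note: '$' also matches just before a trailing newline;
    // use \z instead if that matters.
    public static bool IsLettersExceptX(string input) =>
        Regex.IsMatch(input, @"^[\p{L}-[X]]+$");
}
```

This accepts accented and non-Latin letters that [A-Z] would reject, and still rejects digits, spaces, and any string containing X.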

CodePudding user response:

Negations (^) can only be listed first, to apply to the whole character class. Additionally, there is no way to match the same source character on multiple distinct character classes.

But you could do something like this:

[A-WYZa-wyz]

The weakness here is it also excludes some common accented characters or other items you might actually want.

You could also try something like this:

[^xX0-9\s;:'"!@#$%^&*()_`~,.<>?= \/\[\]{}\\\-]

But I don't recommend it... Unicode is just too big.

Personally, I'd keep this as two separate checks.
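The two-check approach might look like the following sketch. The method name and the choice to reject empty strings are assumptions, not something the answers specify.

```csharp
using System.Linq;

public static class TwoChecks
{
    // Check 1: every character is a Unicode letter (char.IsLetter).
    // Check 2: none of them is uppercase 'X'.
    public static bool IsLettersExceptX(string input) =>
        !string.IsNullOrEmpty(input)
        && input.All(char.IsLetter)
        && !input.Contains('X');
}
```

Keeping the checks separate avoids having to encode Unicode letter ranges in a pattern, at the cost of two passes over the string.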

Tags: c#