Regular Expression with specific special characters triggering Arabic

Time:09-30

I have a regular expression that I use to detect special characters, because I couldn't figure out one that allows only letters, spaces, '-' and ','. Valid inputs look like: California - USA; Cairo, Egypt; London UK.

The regular expression I'm using:

'/[!@#$%^&*<>();{}[\]_؟:\+=~\/\?\.\\"\']+/'

The many backslashes (\) are there to escape regular expression metacharacters.

It works fine with English input such as New York - USA.

But it also matches any Arabic words, such as القاهرة - مصر محمد

$input = "القاهرة - مصر";

if (preg_match('/[!@#$%^&*<>();{}[\]_؟:\+=~\/\?\.\\"\']+/', $input)) {
    echo 'match';
}

Why is it matching Arabic letters when it only includes specific characters?

CodePudding user response:

You can use

preg_match('~^[\p{L}\p{M},\s-]+\z~u', $input)

Details:

  • ^ - start of string
  • [\p{L}\p{M},\s-]+ - one or more letters (\p{L}), diacritics (\p{M}), commas, whitespace characters (\s), and hyphens, up to
  • \z - the very end of the string.
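
As for why the original pattern matched Arabic text, a likely explanation (assuming both the source file and the input are UTF-8) is the missing u modifier. Without it, PCRE works on bytes, so the Arabic question mark ؟ (U+061F, bytes D8 9F) adds two separate bytes to the character class, and many Arabic letters begin with the byte D8. The sketch below illustrates this with a reduced character class and then applies the allow-list pattern from this answer. The helper name isValidLocation is hypothetical and used only for illustration.

```php
<?php
// Hypothetical helper (not part of the original post): returns true when
// $input consists only of letters, combining marks, commas, whitespace,
// and hyphens. An empty string is rejected because of the + quantifier.
function isValidLocation(string $input): bool
{
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    // preg_match returns false on invalid UTF-8, so compare strictly with 1.
    return preg_match('~^[\p{L}\p{M},\s-]+\z~u', $input) === 1;
}

$arabic = "القاهرة - مصر";

// Byte mode: the class holds the raw bytes D8 and 9F, and the first letter
// of the input (U+0627) is encoded as D8 A7, so a single byte matches.
$byteMode = preg_match('/[؟]/', $arabic);

// UTF-8 mode: the class holds the single code point U+061F, which does not
// occur in the input.
$utf8Mode = preg_match('/[؟]/u', $arabic);

var_dump($byteMode); // int(1)
var_dump($utf8Mode); // int(0)

// Allow-list check on the sample inputs from the question.
$samples = ["California - USA", "Cairo, Egypt", "London UK", $arabic, "New York!"];
foreach ($samples as $sample) {
    echo $sample, ' => ', isValidLocation($sample) ? 'valid' : 'invalid', PHP_EOL;
}
```

Adding the u modifier to the original deny-list pattern should also stop the false matches, but the allow-list form is closer to the stated goal of accepting only letters, spaces, hyphens, and commas.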