Invalid regular expression - Invalid property name in character class

Time:02-18

I am using a Fastify server with a TypeScript file that calls a function to make sure people can't send unwanted characters. Here is the function:

const SAFE_STRING_REPLACE_REGEXP = /[^\p{Latin}\p{Zs}\p{M}\p{Nd}\-\'\s]/gu;
function secure(text:string) {
  return text.replace(SAFE_STRING_REPLACE_REGEXP, "").trim();
}

But when I try to launch my server, I get this error message: "Invalid regular expression - Invalid property name in character class".

It used to work just fine with my previous regex:

const SAFE_STRING_REPLACE_REGEXP = /[^0-9a-zA-ZàáâäãåąčćęèéêëėįìíîïłńòóôöõøùúûüųūÿýżźñçčšžÀÁÂÄÃÅĄĆČĖĘÈÉÊËÌÍÎÏĮŁŃÒÓÔÖÕØÙÚÛÜŲŪŸÝŻŹÑßÇŒÆČŠŽ∂ð\-\s\']/g;
function secure(text:string) {
  return text.replace(SAFE_STRING_REPLACE_REGEXP, "").trim();
}

But I have been told it wasn't optimized enough. I have also been told that split/join performs better than regex/replace, but I don't know whether I can use it in my case.

CodePudding user response:

You need to use

const SAFE_STRING_REPLACE_REGEXP = /[^\p{Script=Latin}\p{Zs}\p{M}\p{Nd}'\s-]/gu;
// or
const SAFE_STRING_REPLACE_REGEXP = /[^\p{sc=Latin}\p{Zs}\p{M}\p{Nd}'\s-]/gu;

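In a Unicode property escape, a script name must be prefixed with sc= or Script=, so \p{Latin} has to be written as \p{Script=Latin}. Only General_Category values (such as Zs, M and Nd) and binary properties can be used without a prefix. See the ECMAScript reference.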

Also, when you use the u flag, you cannot escape non-special characters, so \' is itself a syntax error: write ' unescaped. \- is still allowed inside a character class, but placing the - at the end of the class means it needs no escape at all.
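To see which of these forms the engine accepts, here is a minimal sketch. The helper name compiles is hypothetical and not part of the question or the answer. It uses the RegExp constructor rather than regex literals, since an invalid literal would stop the whole file from parsing instead of throwing a catchable error.

```typescript
// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(pattern: string, flags: string): boolean {
  try {
    new RegExp(pattern, flags);
    return true;
  } catch (e) {
    // Invalid patterns throw a SyntaxError from the RegExp constructor.
    return false;
  }
}

// Backslashes are doubled because the patterns are written as string literals.
const checks: Array<[string, string]> = [
  ["[\\p{Latin}]", "u"],        // script name without a prefix
  ["[\\p{Script=Latin}]", "u"], // long prefix
  ["[\\p{sc=Latin}]", "u"],     // short prefix
  ["[\\p{Zs}\\p{M}\\p{Nd}]", "u"], // General_Category values need no prefix
  ["[\\']", "u"],               // escaped non-special character
  ["[\\-]", "u"],               // escaped hyphen inside a class
  ["[\\']", ""],                // same escape without the u flag
];

for (const [pattern, flags] of checks) {
  console.log(`/${pattern}/${flags} compiles: ${compiles(pattern, flags)}`);
}
```

Only the unprefixed \p{Latin} and the escaped \' should be rejected under the u flag. Without the u flag, \' is accepted, which is why the old pattern with the g flag alone never complained about it.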

You can use split and join, too:

const SAFE_STRING_REPLACE_REGEXP = /[^\p{Script=Latin}\p{Zs}\p{M}\p{Nd}'\s-]/u;
console.log("Ącki-Łał русский!!!中国".split(SAFE_STRING_REPLACE_REGEXP).join("")); // logs "Ącki-Łał " (the trailing space is kept)

Note that you don't need the g flag with split: it always splits at every match.
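Applied to the secure function from the question, the two approaches could look like the sketch below. The names UNSAFE_CHARS, secureWithReplace and secureWithSplit are made up for this illustration. The pattern is built with the RegExp constructor (hence the doubled backslashes) only so the sketch does not depend on compiler settings for regex literals; the literal form shown above is equivalent.

```typescript
// Same character class as above, written as a string for the RegExp constructor.
const UNSAFE_CHARS = "[^\\p{Script=Latin}\\p{Zs}\\p{M}\\p{Nd}'\\s-]";

// replace needs the g flag to remove every match; split does not.
const UNSAFE_GLOBAL = new RegExp(UNSAFE_CHARS, "gu");
const UNSAFE_SPLIT = new RegExp(UNSAFE_CHARS, "u");

// Variant 1: remove every disallowed character with replace.
function secureWithReplace(text: string): string {
  return text.replace(UNSAFE_GLOBAL, "").trim();
}

// Variant 2: split on every disallowed character and glue the pieces back together.
function secureWithSplit(text: string): string {
  return text.split(UNSAFE_SPLIT).join("").trim();
}

console.log(secureWithReplace("Ącki-Łał русский!!!中国")); // Ącki-Łał
console.log(secureWithSplit("Ącki-Łał русский!!!中国"));   // Ącki-Łał
```

Both variants return the same string for these inputs. Whether split/join is actually faster depends on the engine and the input, so it is worth measuring on real data before switching.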
