How can I change my regex to eliminate consecutive identical special characters?

Time:11-25

My regex validates first and last names. The acceptable forms are as follows:

  • Jacob Wellman
  • Wellman, Jacob
  • Wellman, Jacob Wayne
  • O’Shaughnessy, Jake L.
  • John O’Shaughnessy-Smith
  • Kim

The unacceptable forms are as follows:

  • Timmy O’’Shaughnessy
  • John O’Shaughnessy--Smith
  • K3vin Malone
  • alert(“Hello”)
  • select * from users;

My current regex is as follows.

^[\w'\-,.][^0-9_!¡?÷?¿\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$

It works properly for validating all of the names except for:

  • Timmy O’’Shaughnessy
  • John O’Shaughnessy--Smith

The reason for this is that the regex doesn't take into account consecutive identical special characters. How can I change my regex to take those into account?

CodePudding user response:

You can exclude consecutive identical characters with a negative lookahead that contains a backreference. It asserts that none of the listed characters is directly followed by the same character: ^(?!.*([’-])\1)

Note that your current pattern only matches names that are at least 3 characters long, so it will not match names like Al.

If you want to match those as well, you can change {2,} to + in the pattern.

^(?!.*([’-])\1)[\w',.-][^\n\r0-9_!¡?÷¿\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$
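
As a rough check, the pattern can be run against the names from the question. This is a minimal Python sketch, not part of the original answer: the helper name is_valid_name is made up for illustration, and the character class is assumed to contain + (a literal space in that position would reject every name that contains a space).

```python
import re

# Pattern from the answer. Assumption: the negated class contains "+" rather
# than a space, since a forbidden space would reject "Jacob Wellman".
NAME_RE = re.compile(
    r"^(?!.*([’-])\1)[\w',.-][^\n\r0-9_!¡?÷¿\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$"
)


def is_valid_name(name: str) -> bool:
    """Hypothetical helper: True if the string matches NAME_RE."""
    return NAME_RE.match(name) is not None


accepted = [
    "Jacob Wellman",
    "Wellman, Jacob",
    "Wellman, Jacob Wayne",
    "O’Shaughnessy, Jake L.",
    "John O’Shaughnessy-Smith",
    "Kim",
]
rejected = [
    "Timmy O’’Shaughnessy",       # doubled apostrophe, caught by the lookahead
    "John O’Shaughnessy--Smith",  # doubled hyphen, caught by the lookahead
    "K3vin Malone",
    "alert(“Hello”)",
    "select * from users;",
]

for name in accepted + rejected:
    print(f"{name!r}: {is_valid_name(name)}")
```

With {2,} as written, a two-letter name such as Al is still rejected; swapping the quantifier for + would let it through.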


Matching names can be difficult. This page is an interesting read on the subject:

Falsehoods Programmers Believe About Names

CodePudding user response:

^(?:[^0-9'\-\., _!¡?÷?¿\\+=@#$%ˆ&*(){}|~<>;:[\]]+(?:['-]|, | |\.|\. |$))+$

I used your forbidden character set and added '\-\., and the space to it. Then I let those characters repeat with +. After them I inserted a group of allowed separators, (?:['-]|, | |\.|\. |$), and allowed the whole pattern to repeat with +.
I tried it here.
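
The token-and-separator idea can be tried out with a short Python sketch. It is illustrative only: it assumes non-capturing groups (?:...) and a + quantifier on both the character class and the outer group, and the helper name is_tokenized_name is hypothetical. Because the pattern lists only the straight apostrophe ', the test names use that character.

```python
import re

# Each repetition is one run of allowed characters followed by exactly one
# separator (or the end of the string), so two separators can never touch.
TOKEN_RE = re.compile(
    r"^(?:[^0-9'\-\., _!¡?÷?¿\\+=@#$%ˆ&*(){}|~<>;:[\]]+(?:['-]|, | |\.|\. |$))+$"
)


def is_tokenized_name(name: str) -> bool:
    """Hypothetical helper: True if the string matches TOKEN_RE."""
    return TOKEN_RE.match(name) is not None


for name in [
    "Wellman, Jacob Wayne",
    "O'Shaughnessy, Jake L.",
    "John O'Shaughnessy-Smith",
    "Timmy O''Shaughnessy",
    "John O'Shaughnessy--Smith",
]:
    print(f"{name!r}: {is_tokenized_name(name)}")
```

One caveat with this approach: a typographic apostrophe (’) is neither in the forbidden set nor among the separators, so it is treated as an ordinary name character and a doubled ’’ still passes. To cover it, ’ would have to be added both to the character class and to the separator group.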

CodePudding user response:

You could handle it separately, before your validation. In Perl, a substitution that collapses repeated special characters into one would be:

s/(\W)\1+/$1/g

so for example:

$ echo "John O’’Shaughnessy--Smith" | perl -C -pe 's/(\W)\1+/$1/g'
John O’Shaughnessy-Smith
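
The same clean-up step can be sketched in Python with re.sub, for cases where the validation does not run in Perl. The function name collapse_repeats is made up for illustration.

```python
import re


def collapse_repeats(text: str) -> str:
    """Collapse any run of one repeated non-word character into a single one."""
    # (\W) captures one non-word character; \1+ matches further copies of it.
    return re.sub(r"(\W)\1+", r"\1", text)


print(collapse_repeats("John O’’Shaughnessy--Smith"))
# prints: John O’Shaughnessy-Smith
```

Since \W also matches whitespace, runs of spaces are collapsed as well. If only apostrophes and hyphens should be normalized, a narrower class such as [’'-] could replace \W.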