regex for matching lastName, firstName, middleName

Time:10-01

I have the regex below to find lastName, firstName, middleName written with or without dots, with or without spaces, and so on. How can I improve it so that it matches all of my examples without issues?

   [А-Я] [а-я]*\s [А-Я]\.*[а-я]*\.*\s*[А-Я]*[а-я]*\.*\,*

The issues are highlighted in the attached image.

Here you can see the first name in green, the middle name (if any) in blue, and the surname in orange. It does this solely based on these assumptions:

  • the first name is a capital letter, followed by lowercase letters, and separated from further names by a single space
  • there are one or two names following this first name
  • these later names may either take the form of the first name, or be a single capital letter followed by a space, a period, or another name
  • the end of the name is only recognisable by the end of a line, some other word (something beginning with a lowercase letter), or a non-word, non-whitespace character
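
As a rough illustration, the assumptions above can be sketched in Python along the following lines. This is not the pattern from the original demo, only one possible reading of the list. The names `UPPER`, `LOWER`, `WORD`, `INITIAL`, `PART`, `NAME_RE` and `find_names` are made up for this example, and adding Ё/ё to the character classes is my own assumption (they fall outside the А-Я and а-я ranges).

```python
import re

# Cyrillic ranges as in the question; Ё/ё are added here as an assumption,
# because they are not covered by А-Я / а-я.
UPPER = "А-ЯЁ"
LOWER = "а-яё"

WORD = rf"[{UPPER}][{LOWER}]+"   # a capital letter followed by lowercase letters
INITIAL = rf"[{UPPER}]\.?"       # a single capital letter, optionally with a period
PART = rf"(?:{WORD}|{INITIAL})"  # a later name: full word or initial

NAME_RE = re.compile(
    rf"(?<!\w)(?P<first>{WORD})"   # first name, not glued to a preceding word
    rf" (?P<second>{PART})"        # exactly one space, then a second name
    # optional third name: after a space, or directly after an initial
    rf"(?:(?: |(?<=[{UPPER}.]))(?P<third>{PART}))?"
    # end of name: end of line, a lowercase word, or a non-word non-space char
    rf"(?=\s*$|\s+[{LOWER}]|[^\w\s])",
    re.MULTILINE,
)


def find_names(text):
    """Return (first, middle, last) tuples; middle is None for two-part names."""
    names = []
    for m in NAME_RE.finditer(text):
        first, second, third = m.group("first", "second", "third")
        if third is None:
            names.append((first, None, second))
        else:
            names.append((first, second, third))
    return names


print(find_names("Иванов И.И., Петров Петр Петрович"))
# [('Иванов', 'И.', 'И.'), ('Петров', 'Петр', 'Петрович')]
```

Note that under these rules any capitalised word at the start of a sentence looks like a first name: `find_names("Вчера Иван Петров ушел")` returns `[('Вчера', 'Иван', 'Петров')]`.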

But outside of a toy for learning, or perhaps a highlighting aid for human reading, it would never be perfect. For that you would need an actual language parser: something that understands not names, but all the other words and the syntax between them.
