REGEX for first and last names involving dots (.)

Time:02-01

I want to match a first and last name that each start with a capital letter. I already have a pattern for that, taken from a post here on Stack Overflow:

[A-Z][a-z]+([ ][A-Z][a-z]+)*

However, my business rules also require accepting a first or last name abbreviated to its first letter followed by a period. Each period must be followed by exactly one space, unless it is the last character of the string.

For example:

"John Doe" -> true
"John D." -> true
"John D. D." -> true
"John D. D. " -> false (there is a space after the last period)
"John D.  D." -> false (there are two spaces after the first period)
"John D.oe" -> false (a period is not followed by a space)

To handle this, I wrote the following fragment, which matches a dot (.) optionally followed by a whitespace character. I don't know how to integrate it into the regex above, or what else is needed:

([.][\s]?)

The regex I came up with is incomplete and does not produce the result I am looking for:

^[A-Z](?:[a-z]|[\.])+(?:[ ][A-Z](?:[a-z]|[\.])+)*$

"John D.oe" matches, but it should not, because every period that is not at the end of the string must be followed by a space.
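The problem described above can be reproduced with a short Python sketch. It assumes the attempted pattern carries a `+` quantifier after each letter-or-period group, and the helper name `attempt_matches` is made up for illustration.

```python
import re

# Assumed reading of the attempted pattern: each capital letter is followed by
# one or more characters that are either a lowercase letter or a period.
ATTEMPT = re.compile(r"^[A-Z](?:[a-z]|[\.])+(?:[ ][A-Z](?:[a-z]|[\.])+)*$")


def attempt_matches(name: str) -> bool:
    """Return True if the attempted pattern accepts the whole string."""
    return ATTEMPT.match(name) is not None


for name in ["John Doe", "John D.", "John D.oe"]:
    print(repr(name), attempt_matches(name))
```

Because the period sits in the same repeated alternation as the lowercase letters, nothing forces a space (or the end of the string) after it, so "John D.oe" is accepted.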

Does anyone know how I can solve this?

CodePudding user response:

I think this can do the trick:

^(?:\s?[A-Z](?:(?:\.)|(?:\w+?)))+?$

Explanation:

  • We use non-capturing groups (?:...) to group the logic.
  • Every group starts with a capital letter and is followed by either a period or a run of word characters \w+? (note that \w also matches uppercase letters, digits, and underscores, not only lowercase letters).
  • Every group is optionally preceded by a single whitespace character \s? (the space is at the beginning of the group to avoid trailing spaces).
  • The +? at the end repeats the group one or more times. Because the pattern is anchored with ^ and $, the groups must cover the whole line.
  • If there is anything in the line that is not part of a group, the match fails.

Because every group is matched the same way, this also works when the first name is an initial, like J. Doe, and with four names, like J. D. D. Doe.
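To check this pattern against the six examples from the question, a small Python sketch could look like the following. It assumes the quantifiers read `\w+?` and a trailing `+?`; the names `NAME_RE`, `is_valid_name`, and `EXPECTED` are illustrative only.

```python
import re

# Assumed reading of the answer's pattern, with `+` in `\w+?` and in the
# trailing `+?`.
NAME_RE = re.compile(r"^(?:\s?[A-Z](?:(?:\.)|(?:\w+?)))+?$")


def is_valid_name(name: str) -> bool:
    """Return True if the whole string matches NAME_RE."""
    return NAME_RE.match(name) is not None


# The six examples from the question, with the results the question asks for.
EXPECTED = {
    "John Doe": True,
    "John D.": True,
    "John D. D.": True,
    "John D. D. ": False,
    "John D.  D.": False,
    "John D.oe": False,
}

for name, wanted in EXPECTED.items():
    print(f"{name!r}: got {is_valid_name(name)}, wanted {wanted}")
```

One caveat of this reading: since the leading `\s?` is optional, an input such as "John D.D." (a period with no space after it) is accepted as well, so a stricter pattern may be needed if that case matters.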


CodePudding user response:

You can attempt to match the regular expression

^[A-Z][a-z]+ [A-Z](?:[a-z]+|\.(?: [A-Z]\.)?)$


The expression can be broken down as follows.

^              # match beginning of string
[A-Z]          # match an uppercase letter
[a-z]+         # match one or more lowercase letters
[ ][A-Z]       # match a space followed by an uppercase letter
(?:            # begin a non-capture group
  [a-z]+       # match one or more lowercase letters
|              # or
  \.           # match a period
  (?:          # begin a non-capture group
    [ ][A-Z]\. # match a space followed by an uppercase letter
               # followed by a period   
  )?           # end inner non-capture group and make it optional
)              # end outer non-capture group
$              # match the end of the string

I have enclosed spaces in character classes ([ ]) above merely to make them visible.
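The breakdown above maps directly onto Python's `re.VERBOSE` mode, where whitespace outside character classes and `#` comments are ignored. The following is a minimal sketch under the assumption that the letter classes carry `+` quantifiers, as the comments ("one or more") indicate; `PATTERN` and `matches_name` are hypothetical names.

```python
import re

# In verbose mode a literal space must be written inside a character class
# ([ ]) or escaped, otherwise it is ignored.
PATTERN = re.compile(
    r"""
    ^                   # beginning of string
    [A-Z][a-z]+         # first name: uppercase letter, then lowercase letters
    [ ][A-Z]            # a space followed by an uppercase letter
    (?:
        [a-z]+          # rest of a full last name
      |
        \.              # or a period (an initial)
        (?:[ ][A-Z]\.)? # optionally followed by a second initial
    )
    $                   # end of string
    """,
    re.VERBOSE,
)


def matches_name(name: str) -> bool:
    """Return True if the whole string matches PATTERN."""
    return PATTERN.match(name) is not None


for name in ["John Doe", "John D.", "John D. D.",
             "John D. D. ", "John D.  D.", "John D.oe"]:
    print(repr(name), matches_name(name))
```

Note that this pattern requires a full first name and allows at most two initials after it, so inputs such as "J. Doe" are rejected.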
