How do I create a regex to avoid a repeated number with optional hyphen?

Time:12-22

I've been stuck on this problem for days. I need a regex to validate an Id of this format:

^[0-9]{2}[-]{0,1}[0-9]{7}$

But from this pattern I have to exclude sets like:

  • 00-0000000 and 000000000
  • 11-1111111 and 111111111
  • 22-2222222 and 222222222
  • ...
  • 99-9999999 and 999999999

Note that 22-2222221 is valid.

Can anyone help me? If the Id is written without the dash, I could use ^(\d)(?!\1{8})\d{8}$ but what if the hyphen is present?

I tried something like

^(\d)(?!\1{8})\d{8}$|^(\d)(?!\1{2})\d{1}-(?!\1{7})\d{7}$

but the second part does not seem to work properly.
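One way to see what goes wrong in the second alternative is to run it. The sketch below assumes Python's re module; other engines may treat a backreference to an unset group differently. In the second branch the leading digit is captured in Group 2, so \1 refers to a group that did not take part in the match. In Python such a backreference never matches, which means both negative lookaheads always pass. The helper name attempt_accepts is only illustrative.

```python
import re

# The attempted pattern from the question, unchanged.
ATTEMPT = re.compile(
    r'^(\d)(?!\1{8})\d{8}$|^(\d)(?!\1{2})\d{1}-(?!\1{7})\d{7}$'
)

def attempt_accepts(candidate: str) -> bool:
    """Return True if the attempted pattern matches the candidate."""
    return ATTEMPT.match(candidate) is not None

# First alternative: Group 1 is set, so nine equal digits are rejected.
print(attempt_accepts('111111111'))   # False
# Second alternative: Group 1 is unset, so nothing is excluded.
print(attempt_accepts('11-1111111'))  # True, although it should be rejected
print(attempt_accepts('22-2222221'))  # True
```

Simply renumbering \1 to \2 would not be enough either, because (?!\2{7}) would then reject valid Ids such as 12-1111111.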

Thanks to Wiktor for the solution. How would you adapt it for the pattern below as well?

^[0-9]{3}[-]{0,1}[0-9]{2}[-]{0,1}[0-9]{4}$

Probably something like this can work: ^(?!(\d)(?:-?\1)*$)\d{3}-?\d{2}-?\d{4}$
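A quick way to check this adaptation is to run it against a few samples. This is a minimal sketch assuming Python's re module; the name is_valid_adapted and the sample values are made up for illustration.

```python
import re

# The adapted pattern from the question: three digits, two digits, four digits,
# each hyphen optional, and not every digit identical.
ADAPTED = re.compile(r'^(?!(\d)(?:-?\1)*$)\d{3}-?\d{2}-?\d{4}$')

def is_valid_adapted(candidate: str) -> bool:
    """Return True if the candidate has the 3-2-4 layout and is not one repeated digit."""
    return ADAPTED.match(candidate) is not None

for sample in ['123-45-6789', '111-11-1112', '111-11-1111', '11111-1111', '111111111']:
    print(sample, is_valid_adapted(sample))
```

The first two samples are accepted and the last three are rejected, wherever the optional hyphens are placed.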

CodePudding user response:

You can use

^(?!(\d)(?:-?\1)*$)\d{2}-?\d{7}$

See the regex demo.

Details:

  • ^ - start of string
  • (?!(\d)(?:-?\1)*$) - a negative lookahead that fails the match if there is
    • (\d) - a digit (captured in Group 1)
    • (?:-?\1)* - zero or more sequences of an optional - and then the digit that was captured in Group 1
    • $ - end of string, immediately to the right of the current location
  • \d{2}-?\d{7} - two digits, an optional -, and then seven digits
  • $ - end of string.
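The breakdown above can be exercised with the Ids from the question. The following is a sketch assuming Python's re module. The function names are hypothetical, and is_valid_id_plain is a swapped-in, non-lookahead formulation included only to show what the lookahead is checking: the format must fit and the digits must not all be the same.

```python
import re

ID_PATTERN = re.compile(r'^(?!(\d)(?:-?\1)*$)\d{2}-?\d{7}$')

def is_valid_id(candidate: str) -> bool:
    """Regex check: 2 digits, optional hyphen, 7 digits, not all the same digit."""
    return ID_PATTERN.match(candidate) is not None

def is_valid_id_plain(candidate: str) -> bool:
    """Hypothetical equivalent without a lookahead, for comparison only."""
    if re.fullmatch(r'\d{2}-?\d{7}', candidate) is None:
        return False
    digits = candidate.replace('-', '')
    # More than one distinct digit means the Id is not a single repeated digit.
    return len(set(digits)) > 1

samples = ['00-0000000', '000000000', '99-9999999',
           '22-2222221', '12-3456789', '123456789', '1-23456789']
for sample in samples:
    print(sample, is_valid_id(sample), is_valid_id_plain(sample))
```

Both functions agree on these samples: the three all-same Ids and the misplaced hyphen are rejected, while 22-2222221, 12-3456789 and 123456789 are accepted.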

CodePudding user response:

To create a regular expression (regex) that matches a string of digits and hyphens in which no digit is immediately followed by the same digit, you can use the following regex:

^(?!.*(\d)\1)[\d-]*$

This regex uses a negative lookahead assertion to check that the string does not contain a doubled digit. The (?!.*(\d)\1) part of the regex uses a capturing group ((\d)) to capture a digit, and then uses the backreference \1 to match the same digit again immediately after it. The .* inside the lookahead allows any characters before the doubled digit, so the pair is detected anywhere in the string; the two identical digits must still be adjacent.

The [\d-]* part of the regex allows the string to contain any number of digits or hyphens. The ^ and $ anchors at the beginning and end of the regex ensure that the entire string must match the pattern. Note that this is stricter than what the question asks for: it rejects any Id with two identical adjacent digits, including 22-2222221, and it does not enforce the two-digit, optional-hyphen, seven-digit layout.

Here are some examples of how this regex can be used:

import re

pattern = r'^(?!.*(\d)\1)[\d-]*$'

# Test strings that should match the regex (no doubled digit)
assert re.fullmatch(pattern, '123-45-6789') is not None
assert re.fullmatch(pattern, '9876-543-21') is not None
assert re.fullmatch(pattern, '123456789') is not None
assert re.fullmatch(pattern, '987654321') is not None

# These also match, because the pattern does not restrict where hyphens appear
assert re.fullmatch(pattern, '123-45-6789-') is not None
assert re.fullmatch(pattern, '123--45-6789') is not None
assert re.fullmatch(pattern, '123-45-6789-0') is not None

# Test strings that should not match the regex (doubled digit)
assert re.fullmatch(pattern, '123-45-6789-00') is None
assert re.fullmatch(pattern, '123-45-6789-000') is None
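To relate this answer to the original question, the sketch below runs its pattern, read with the * quantifiers that its description implies, next to the pattern from the first answer. It assumes Python's re module, and the compare helper is only illustrative.

```python
import re

# The pattern from this answer: no digit directly followed by the same digit.
NO_DOUBLED_DIGIT = re.compile(r'^(?!.*(\d)\1)[\d-]*$')
# The pattern from the first answer: 2-7 layout, not all digits identical.
NOT_ALL_SAME = re.compile(r'^(?!(\d)(?:-?\1)*$)\d{2}-?\d{7}$')

def compare(candidate: str) -> tuple:
    """Return (matches this answer's pattern, matches the first answer's pattern)."""
    return (NO_DOUBLED_DIGIT.match(candidate) is not None,
            NOT_ALL_SAME.match(candidate) is not None)

print(compare('22-2222221'))  # (False, True): valid per the question, rejected here
print(compare('00-0000000'))  # (False, False)
print(compare('12-3456789'))  # (True, True)
print(compare('1-2-3'))       # (True, False): wrong layout, accepted here
```

The two patterns agree only when an Id has the right layout and happens to contain no doubled digit.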
