Python Regular expression to match printing pages and their range

Time:10-15

I have this regular expression that matches print page specifications (e.g. 6, 1-6, 6:4, 10-20/3):

^([1-9]\d*)((?<=\d)[-]|[:]?)((?<=-|:)?[1-9]\d*)?(?:(?<=)([/]?))([1-9]\d*)?$

It currently matches, for example, 2048-4096/100 and 15:10/3.

However, it also matches 5/3, when / should only be allowed after a colon or dash followed by digits, as in 2048-4096/100.

In place of the empty positive lookbehind in the above expression I've tried (?:(?<=[:|-]\d)([/]?)), but that makes all my tests fail with no matches. I've also tried (?:(?<=[:|-]\d*)([/]?)), but quantifiers are not allowed in a lookbehind.
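
The second attempt fails at compile time because Python's built-in re module only accepts fixed-width lookbehinds. A minimal sketch that reproduces this; the helper name `compiles` is made up for illustration and is not part of the original post:

```python
import re

def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed width: one separator character followed by exactly one digit.
fixed_width_ok = compiles(r"(?<=[:|-]\d)/")

# Variable width: \d* can match any number of digits, so re rejects the pattern.
variable_width_ok = compiles(r"(?<=[:|-]\d*)/")

print(fixed_width_ok, variable_width_ok)
```

As a side note, | is a literal pipe inside a character class, so [:|-] also accepts |. If only a colon or dash is intended, [:-] is enough.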

What can I put in the empty positive lookbehind so that it checks that a : or - followed by digits comes before the /?

Answer:

You can use

^([1-9]\d*)(?:([-:])([1-9]\d*)(?:(/)([1-9]\d*))?)?$

Details:

  • ^ - start of string
  • ([1-9]\d*) - Group 1: a non-zero digit and then zero or more digits
  • (?:([-:])([1-9]\d*)(?:(/)([1-9]\d*))?)? - an optional occurrence of
    • ([-:]) - Group 2: - or :
    • ([1-9]\d*) - Group 3: a non-zero digit and then zero or more digits
    • (?:(/)([1-9]\d*))? - an optional occurrence of
      • (/) - Group 4: /
      • ([1-9]\d*) - Group 5: a non-zero digit and then zero or more digits
  • $ - end of string.
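
To see how the groups listed above fill in, here is a small sketch that runs the pattern against the examples from the question plus a few invalid inputs. The names `PAGE_SPEC`, `samples`, and `results` are chosen for illustration only:

```python
import re

# The pattern from the answer, unchanged.
PAGE_SPEC = re.compile(r"^([1-9]\d*)(?:([-:])([1-9]\d*)(?:(/)([1-9]\d*))?)?$")

samples = ["6", "1-6", "6:4", "10-20/3", "2048-4096/100", "15:10/3",
           "5/3", "0-5", "1-"]

results = {}
for text in samples:
    match = PAGE_SPEC.match(text)
    # groups() returns None for every optional group that did not participate.
    results[text] = match.groups() if match else None
    print(f"{text!r:18} -> {results[text]}")
```

Because the / part is nested inside the optional separator part, 5/3 can no longer match: the / is only reachable after a - or : and a second number have been consumed.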

I kept all the groups intact, but at least the (/) group is redundant, since it can only ever capture a literal /.
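
If the group around / is not needed, one possible variant drops it and names the remaining groups. This is only a sketch: the group names (`first`, `sep`, `second`, `after_slash`) and the helper `parse_page_spec` are hypothetical, and it swaps the ^...$ anchors for re.fullmatch, which also rejects a trailing newline that $ would tolerate:

```python
import re

# Same grammar as the answer's pattern, minus the capturing group around "/".
# Group names are illustrative and do not come from the original post.
PAGE_SPEC = re.compile(
    r"(?P<first>[1-9]\d*)"
    r"(?:(?P<sep>[-:])(?P<second>[1-9]\d*)"
    r"(?:/(?P<after_slash>[1-9]\d*))?)?"
)

def parse_page_spec(text):
    """Return a dict of the captured parts, or None if text is not a valid spec."""
    match = PAGE_SPEC.fullmatch(text)
    if match is None:
        return None
    return match.groupdict()

print(parse_page_spec("10-20/3"))
print(parse_page_spec("5/3"))
```

Parts that are absent from the input come back as None in the dict, so a caller can tell a single page such as 6 apart from a range such as 1-6 by checking `sep`.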
