Date Regex matching with lookbehind

Time:08-11

I am trying to match specific dates with regex.

What I need to match (without quotes):
-> "September 2020"
-> "September 20"

What should not match(!):
-> "21. September 2020"
-> "21 September 20"

So only the month September followed by a year, without any day beforehand!

What I have tried:

(?<![0-9][.]) September [1-4]?[0-9]?[0-9][0-9]

This regex works fine if the number before the date is followed by a dot: "21. September 2020" does not match, but "21 September 2020" does. It isn't possible to make the dot optional inside the lookbehind, because "A quantifier inside a lookbehind makes it non-fixed width".
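The behaviour described above can be reproduced outside the database. The following is only a sketch, assuming Python's `re` module as a stand-in for PostgreSQL's regex engine (both accept this fixed-width lookbehind); the `matches` helper and the sample strings are made up for illustration.

```python
import re

# Pattern from the question: the lookbehind only rejects "digit + dot".
ORIGINAL = re.compile(r"(?<![0-9][.]) September [1-4]?[0-9]?[0-9][0-9]")


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text (hypothetical helper)."""
    return pattern.search(text) is not None


# Illustrative inputs; the pattern starts with a space, so the wanted
# cases are embedded in a short sentence.
samples = [
    "in September 2020",   # wanted
    "in September 20",     # wanted
    "21. September 2020",  # unwanted, rejected by the lookbehind
    "21 September 20",     # unwanted, but still matches
]

for sample in samples:
    print(f"{sample!r}: {matches(ORIGINAL, sample)}")
```

The last sample slips through because the two characters before the space are "21", which is not a digit followed by a dot.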

Further information
-> I need to run this regex with PostgreSQL
-> It's no problem that the regex is pretty simple and has a lot of fixed chars. I don't understand / need this regex wizardry which some people are using for matching dates ;)

Does anyone have an idea what a lookbehind would look like that rejects both cases (with and without a dot)?

Thanks so much!

CodePudding user response:

You could add another lookbehind to assert not a digit:

(?<![0-9][.])(?<![0-9]) September [1-4]?[0-9]?[0-9][0-9]
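As a quick sanity check of the two stacked lookbehinds, here is a minimal sketch, again assuming Python's `re` module rather than PostgreSQL itself; the `matches` helper and the sample strings are hypothetical.

```python
import re

# Two fixed-width negative lookbehinds applied at the same position:
# the first rejects "digit + dot", the second rejects a bare digit.
FIXED = re.compile(r"(?<![0-9][.])(?<![0-9]) September [1-4]?[0-9]?[0-9][0-9]")


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text (hypothetical helper)."""
    return pattern.search(text) is not None


for sample in [
    "in September 2020",   # wanted
    "in September 20",     # wanted
    "21. September 2020",  # rejected by the first lookbehind
    "21 September 20",     # rejected by the second lookbehind
]:
    print(f"{sample!r}: {matches(FIXED, sample)}")
```

Because each lookbehind has a fixed width on its own, no quantifier is needed to cover both the "with dot" and "without dot" cases.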


If you only want to match 2 or 4 digits after September:

(?<![0-9][.])(?<![0-9]) September (?:[1-4][0-9])?[0-9][0-9]\y

Here (?:[1-4][0-9])? is a non-capturing group that optionally matches 2 digits, and \y matches only at the beginning or end of a word.
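The effect of the word boundary can be checked with a small sketch. It assumes Python's `re` module, where `\y` does not exist, so `\b` is swapped in as the closest equivalent; the `matches` helper and the sample strings are hypothetical.

```python
import re

# Same pattern as above, with PostgreSQL's \y replaced by Python's \b.
STRICT = re.compile(
    r"(?<![0-9][.])(?<![0-9]) September (?:[1-4][0-9])?[0-9][0-9]\b"
)


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text (hypothetical helper)."""
    return pattern.search(text) is not None


for sample in [
    "in September 2020",   # 4 digits: accepted
    "in September 20",     # 2 digits: accepted
    "in September 202",    # 3 digits: rejected by the word boundary
    "in September 20201",  # 5 digits: rejected by the word boundary
    "21 September 2020",   # day before the month: rejected by the lookbehind
]:
    print(f"{sample!r}: {matches(STRICT, sample)}")
```

Without the trailing boundary, a 3-digit or 5-digit number would still produce a match on its first digits.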
