I am trying to match specific dates with regex.
What I need to match (without quotes):
-> "September 2020"
-> "September 20"
What should not match(!):
-> "21. September 2020"
-> "21 September 20"
So only the month September followed by a year, without any day before it!
What I have tried:
(?<![0-9][.]) September [1-4]?[0-9]?[0-9][0-9]
This regex works fine if the number before the date has a dot: "21. September 2020" does not match, but "21 September 2020" does. It isn't possible to make the dot optional inside the lookbehind, because "A quantifier inside a lookbehind makes it non-fixed width".
Further information
-> I need to run this regex with PostgreSQL
-> It's no problem that the regex is pretty simple and has a lot of fixed characters. I don't understand or need the regex wizardry that some people use for matching dates ;)
Does anyone have an idea what a lookbehind would look like that rejects both cases (with and without a dot)?
Thanks so much!
CodePudding user response:
You could add a second lookbehind that asserts there is no digit directly before the space:
(?<![0-9][.])(?<![0-9]) September [1-4]?[0-9]?[0-9][0-9]
If you only want to match 2 or 4 digits after September:
(?<![0-9][.])(?<![0-9]) September (?:[1-4][0-9])?[0-9][0-9]\y
Where (?:[1-4][0-9])? is a non-capturing group that optionally matches 2 digits, and \y matches at the beginning or end of a word.
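To check the two stacked lookbehinds outside the database, here is a minimal sketch in Python. It is only a stand-in for PostgreSQL: it assumes Python's re module, swaps PostgreSQL's \y for Python's \b, and the helper name has_bare_september is made up for illustration. The sample strings are the four cases from the question, embedded in a bit of surrounding text.

```python
import re

# The second pattern from above, with PostgreSQL's \y replaced by Python's \b.
# Both lookbehinds are fixed-width: one is 2 characters wide, the other 1.
PATTERN = re.compile(
    r"(?<![0-9][.])(?<![0-9]) September (?:[1-4][0-9])?[0-9][0-9]\b"
)


def has_bare_september(text):
    """Return True if text contains ' September <year>' with no day before it.

    Hypothetical helper, not part of any library.
    """
    return PATTERN.search(text) is not None


samples = [
    "in September 2020",      # should match
    "in September 20",        # should match
    "on 21. September 2020",  # day with a dot: should not match
    "on 21 September 20",     # day without a dot: should not match
]
for sample in samples:
    print(sample, "->", has_bare_september(sample))
```

Note that the pattern as written starts with a literal space, so a string that begins directly with "September 2020" (no character before it) will not match. If that case matters, the leading space would have to be handled differently.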