I have the following strings:
https://test.com/fi/wp-content
https://test.com/fr
https://test.com/es
https://test.com/
https://test.com/wp-content/
https://test.com/image.png
https://test.com/de/wp-content/themes
https://test.com/es
https://test.com/fr
https://test.com/no
https://test.com/da
https://test.com/en
https://test.com/de
https://test.com/nl/wp-content
https://test.com/fi
So far I have the following regex:
/\btest.com.*\.*(?<!fr|es|da|no|en|de|nl|fi)$/gm
I want to match the following (image 1)
I'm almost there, but my regex matches everything after my expression, like this (image 2):
I can't seem to figure out how to get the end of my regex to behave so that it produces the match shown in image 1. Here is a regex101: https://regex101.com/r/Tv0AjJ/1
CodePudding user response:
Currently, this part .*(?<!fr|es|da|no|en|de|nl|fi)$ matches until the end of the string and then asserts that what is directly to the left is not any of the alternatives. That is why /es does not match but .png does.
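To see that behaviour outside regex101, here is a minimal sketch in Python, assuming Python's re module is an acceptable stand-in for the flavor used in the question (the /gm flags are approximated with re.MULTILINE and a per-string search). The helper name original_match is made up for illustration.

```python
import re

# The regex from the question, unchanged (the dot in "test.com" is still unescaped).
ORIGINAL = re.compile(r"\btest.com.*\.*(?<!fr|es|da|no|en|de|nl|fi)$", re.MULTILINE)

# Three of the strings from the question.
urls = [
    "https://test.com/es",
    "https://test.com/image.png",
    "https://test.com/de/wp-content/themes",
]

def original_match(url):
    """Return the text matched by the question's regex, or None (hypothetical helper)."""
    m = ORIGINAL.search(url)
    return m.group(0) if m else None

for url in urls:
    print(url, "->", original_match(url))
```

The lookbehind only inspects the last two characters before the end of the line, so the themes URL is rejected as well, simply because it happens to end in "es".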
You can match the / and then assert that none of the alternatives is directly to the right, using a negative lookahead (?!
Note that the dot has to be escaped.
\btest\.com\/(?!fr|es|d[ae]|no|en|nl|fi)
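The lookahead pattern above can be tried against some of the strings from the question. This is only a sketch in Python, assuming its re module behaves like the flavor used on regex101 for this pattern; the helper name lookahead_match is invented for the example.

```python
import re

# Pattern from the answer: match "test.com/" unless a language code follows the slash.
LOOKAHEAD = re.compile(r"\btest\.com\/(?!fr|es|d[ae]|no|en|nl|fi)")

# A few of the strings from the question.
urls = [
    "https://test.com/fi/wp-content",
    "https://test.com/fr",
    "https://test.com/",
    "https://test.com/wp-content/",
    "https://test.com/image.png",
]

def lookahead_match(url):
    """Return the matched text, or None if a language code follows the slash (hypothetical helper)."""
    m = LOOKAHEAD.search(url)
    return m.group(0) if m else None

for url in urls:
    print(url, "->", lookahead_match(url))
```

Because the lookahead consumes nothing, the match itself is only test.com/ and never extends into the rest of the path.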
If you don't want partial matches, you can group the alternatives and follow the group with either a word boundary \b or, as Wiktor Stribiżew mentioned in the comments, a forward slash or the end of the string (?:\/|$)
Alternatives that start with the same character can be grouped together using a character class, as in d[ae]
With the word boundary, the pattern becomes:
\btest\.com\/(?!(?:fr|es|d[ae]|no|en|nl|fi)\b)
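The difference between the three variants shows up on paths that merely start with a language code. The sketch below uses Python's re module and made-up URLs (english, en-US, en/about are not from the question); the names PATTERNS and matches are also hypothetical.

```python
import re

CODES = r"fr|es|d[ae]|no|en|nl|fi"

# The three variants discussed above, compiled under descriptive (hypothetical) names.
PATTERNS = {
    "prefix": re.compile(r"\btest\.com\/(?!" + CODES + r")"),
    "word_boundary": re.compile(r"\btest\.com\/(?!(?:" + CODES + r")\b)"),
    "slash_or_end": re.compile(r"\btest\.com\/(?!(?:" + CODES + r")(?:\/|$))"),
}

# Hypothetical URLs chosen to expose partial matches; none appear in the question.
samples = [
    "https://test.com/english",
    "https://test.com/en-US",
    "https://test.com/en/about",
    "https://test.com/en",
]

def matches(name, url):
    """True if the named pattern finds test.com/ in url."""
    return PATTERNS[name].search(url) is not None

for url in samples:
    print(url, {name: matches(name, url) for name in PATTERNS})
```

The plain prefix version rejects /english because it starts with en. The \b version accepts /english but still rejects /en-US, since a hyphen counts as a word boundary. The (?:\/|$) version only rejects a code that is a complete path segment, so it accepts both.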