The goal of my regular expression adventure is to create a matcher for a mechanism that adds a trailing slash to URLs, even in the presence of parameters denoted by # or ? at the end of the URL.
For each of the following URLs, I'm looking for a match on segment, as follows:
- https://example.com/what-not/segment matches segment
- https://example.com/what-not/segment?a=b matches segment
- https://example.com/what-not/segment#a matches segment
If there is a match for segment, I'm going to replace it with segment/.
For any of the following URLs, there should be no match, because there is already a trailing slash:
- https://example.com/what-not/segment/ no match
- https://example.com/what-not/segment/?a=b no match
- https://example.com/what-not/segment/#a no match
I've tried:
- This primitive regex and its variants:
  .*\/([^?#\/]+)
  However, with this approach, I could not stop it from matching when there is already a trailing slash.
- I experimented with negative lookaheads as follows:
  ([^\/\#\?]+)(?!(.*[\#\?].*))$
  In this case, I could not get rid of the ? or # parts properly.
Thank you for your kind help!
CodePudding user response:
Lookahead and lookbehind conditionals are so powerful!
(?<=\/)[\w]+(?(?=[\?\#])|$)
P.S.: I just used [\w], which means [a-zA-Z0-9_]. Of course, URLs can contain many other characters, like - or ~, but for the examples provided it works nicely.
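
As a rough sketch of how the replacement step could look in code, here is a Python version. It rests on a few assumptions: Python's built-in re module does not accept a lookahead as the test of a conditional group (it only supports conditionals on group references), so the sketch swaps the conditional for a plain lookahead, (?=[?#]|$), which should behave the same way for the examples in the question. The names SEGMENT and add_trailing_slash are made up for illustration.

```python
import re

# Assumption: a plain lookahead stands in for the conditional group, which
# Python's `re` does not support. The pattern matches a run of word characters
# that starts right after a "/" and is followed by "?", "#", or the end of
# the string.
SEGMENT = re.compile(r"(?<=/)\w+(?=[?#]|$)")


def add_trailing_slash(url: str) -> str:
    """Append "/" to the matched segment; return the URL unchanged if nothing matches."""
    # count=1 limits the substitution to the first match in the URL.
    return SEGMENT.sub(lambda m: m.group(0) + "/", url, count=1)


if __name__ == "__main__":
    for url in (
        "https://example.com/what-not/segment",
        "https://example.com/what-not/segment?a=b",
        "https://example.com/what-not/segment#a",
        "https://example.com/what-not/segment/",
    ):
        print(url, "->", add_trailing_slash(url))
```

URLs whose path already ends in a slash come back unchanged. As with the pattern above, segments containing - or ~ are not covered, and a / inside the query string or fragment (for example ?next=/home at the very end) can still produce a match, so treat this as a starting point rather than a complete solution.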