Regex to match two parts of a url

Time:05-13

This is my first question on Stack Overflow... apologies in advance if I break a rule in asking it. I have searched for my question and was not able to find anything related to what I'm looking for, and I have read through the question posting guide...

I am trying to create a RegEx pattern which will match two parts of a URL.

Example URL:

app.company.com/base-path?parameter1=stuff&parameter2=morestuff&parameter3=IMPORTANT THING

In this case I want the pattern to match only when the URL has both a base path and the third parameter, and to capture both parts: /base-path and all of parameter3=IMPORTANT THING

Any help would be appreciated! Please let me know if I can provide more info...

CodePudding user response:

Here is my answer:

/^.+?(\/.+?)\?.+?&(parameter3=.+)$/gm

I do not know which language you use. This is the PCRE2 flavor, which is used by PHP 7.3+, but I think it is easy to port to other languages.
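As an illustration of porting the pattern, here is a minimal Python sketch. It assumes the quantifiers in the pattern are `.+?` and `.+`, and the helper name `match_path_and_param` is made up for this example. The `m` flag becomes `re.MULTILINE`, and `search` stands in for the `g` flag since only one URL is checked.

```python
import re

# Python port of the pattern; group 1 is the base path, group 2 is the
# whole "parameter3=..." pair up to the end of the line.
PATTERN = re.compile(r"^.+?(/.+?)\?.+?&(parameter3=.+)$", re.MULTILINE)


def match_path_and_param(url):
    """Return (base_path, parameter3_pair), or None if the URL does not match."""
    m = PATTERN.search(url)
    return m.groups() if m else None


url = ("app.company.com/base-path"
       "?parameter1=stuff&parameter2=morestuff&parameter3=IMPORTANT THING")
print(match_path_and_param(url))
```

Note that the pattern expects at least one other parameter before `&parameter3=`, so a URL where parameter3 comes first in the query string would not match.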

Security risk

There is some risk in using a regex here: an attacker can craft a malicious parameter1 or parameter2 value that fools the regex and gives you an unexpected result, especially AFTER THE URL HAS BEEN DECODED.

For example, take this URL:

app.company.com/base-path?parameter1=stuff&parameter2=%26parameter3=morestuff&parameter3=IMPORTANT THING

The attacker sets parameter2=%26parameter3=morestuff (%26 is the URL-encoded form of &), and after decoding you get this URL:

app.company.com/base-path?parameter1=stuff&parameter2=&parameter3=morestuff&parameter3=IMPORTANT THING

What the regex then captures is parameter3=morestuff&parameter3=IMPORTANT THING, which is not what you expected.

So, if you really want to use a regex, DO NOT DECODE THE URL BEFORE MATCHING.
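The difference can be reproduced with a short Python sketch. It assumes the injected value is percent-encoded as `%26parameter3=morestuff` and that the quantifiers in the pattern are `.+?` and `.+`; the helper name `parameter3_match` is hypothetical.

```python
import re
from urllib.parse import unquote

PATTERN = re.compile(r"^.+?(/.+?)\?.+?&(parameter3=.+)$", re.MULTILINE)


def parameter3_match(url):
    """Return the text captured by the parameter3 group, or None."""
    m = PATTERN.search(url)
    return m.group(2) if m else None


# Raw URL as received: the injected "&" is still encoded as %26.
raw = ("app.company.com/base-path?parameter1=stuff"
       "&parameter2=%26parameter3=morestuff&parameter3=IMPORTANT THING")

# Matching the raw URL finds only the real parameter3.
safe = parameter3_match(raw)

# Decoding first turns %26 into "&", so the injected pair is captured too.
spoofed = parameter3_match(unquote(raw))

print(safe)
print(spoofed)
```

If the captured value itself needs decoding, it can be passed to `unquote` after the match rather than before.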
