Regex(es) to extract the URL from different strings

Time:11-13

What regex(es) will extract the URL from strings with these patterns?

https://xxx##.safelinks.protection.outlook.com/?url=[encoded URL to extract]&data=[more detritus]
https://example.com/link/?url=[encoded URL to extract]?l=en-us
https://example.com/link/?url=[encoded URL to extract]

The first part will be \?url=; I am less certain about what comes next, and whether I need to use separate regexes for each pattern. Taking the first pattern,

https://xxx##.safelinks.protection.outlook.com/?url=https://www.domain.com/subd/doc.aspx/&data=[more detritus]

I would want to extract https://www.domain.com/subd/doc.aspx/ (to decode with an existing function).

CodePudding user response:

Assuming all the URL parameters are separated correctly with ampersands (&), then this should work, I think: url=(.+?)(&|$)
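For anyone who wants to try this, here is a minimal Python sketch of that idea. It reads the quantifier as `.+?` (lazy, one or more characters) and makes the terminator group non-capturing. The `xxx01` host name, the percent-encoded sample, and the `extract_url` helper are made up for illustration, and `urllib.parse.unquote` only stands in for the asker's existing decode function.

```python
import re
from urllib.parse import unquote

# Lazily capture everything after "url=" up to the next "&" or end of string.
URL_PARAM = re.compile(r"url=(.+?)(?:&|$)")


def extract_url(link):
    """Return the decoded value of the url parameter, or None if absent.

    Hypothetical helper; unquote() is a stand-in for the existing decoder.
    """
    match = URL_PARAM.search(link)
    if match is None:
        return None
    return unquote(match.group(1))


# Made-up sample in the shape of the first pattern from the question.
sample = (
    "https://xxx01.safelinks.protection.outlook.com/"
    "?url=https%3A%2F%2Fwww.domain.com%2Fsubd%2Fdoc.aspx%2F&data=abc"
)
print(extract_url(sample))
```

Note that this only stops at `&`. In the second pattern from the question the trailing `?l=en-us` is not preceded by an ampersand, so it would end up inside the capture; adding `\?` to the terminator alternatives is one possible adjustment, assuming a literal `?` never appears inside the encoded URL.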

CodePudding user response:

Try this:

(?<=url=)[^&\s]+%2[fF](?:[^&\s%]*)

See regex demo.
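A quick way to check this one locally, sketched in Python. It assumes the quantifier after the first character class is `+`, so the pattern reads `(?<=url=)[^&\s]+%2[fF](?:[^&\s%]*)`. The sample links and the `find_encoded_url` helper are hypothetical, and `urllib.parse.unquote` stands in for whatever decode function is already in place.

```python
import re
from urllib.parse import unquote

# Lookbehind for "url=", then greedily take non-"&", non-space characters
# up to the last encoded slash (%2F or %2f), plus any trailing characters
# that are not "&", whitespace, or "%".
ENCODED_URL = re.compile(r"(?<=url=)[^&\s]+%2[fF](?:[^&\s%]*)")


def find_encoded_url(link):
    """Return the decoded match, or None when the pattern does not match.

    Hypothetical helper; unquote() is a stand-in for the existing decoder.
    """
    match = ENCODED_URL.search(link)
    if match is None:
        return None
    return unquote(match.group(0))


# Made-up samples in the shape of the first and third patterns.
links = [
    "https://xxx01.safelinks.protection.outlook.com/"
    "?url=https%3A%2F%2Fwww.domain.com%2Fsubd%2Fdoc.aspx%2F&data=abc",
    "https://example.com/link/"
    "?url=https%3A%2F%2Fwww.domain.com%2Fsubd%2Fdoc.aspx%2F",
]
for link in links:
    print(find_encoded_url(link))
```

Because the match is the whole pattern rather than a capture group, `group(0)` is already the encoded URL. One thing to watch: the trailing class excludes `%`, so anything after the last encoded slash that contains further percent-escapes (an encoded query string, for example) is cut off at the first `%`.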
