Extracting a URL from a string with regex and PowerShell

Time:04-09

I'm using PowerShell and regex. I'm scraping a web page result into a variable, but I can't seem to extract a generated URL from that variable.

This is the content (the actual URL varies):

&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;

$reg = "([^&]*)&;$" always returns false.

I've been trying -match and Select-String with regex, but I need some guidance.

CodePudding user response:

It really depends on what format the content is in.

(?<=\&quot;) is a lookbehind that requires "&quot;" immediately before the match, (.*?) lazily captures any number of non-newline characters, and (?=;) is a lookahead that ends the match just before the next ";".

Here's a fair start:

$pattern = "(?<=\&quot;)(.*?)(?=;)"
$someText = "&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;"
$newText = [regex]::match($someText, $pattern)
$newText.Value

Returns:

https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp
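Note that this result still ends in "&amp", because the lookahead stops at the first ";", which is the one that closes "&amp;". If the bare URL is wanted, one possible variation is to look ahead for the whole "&amp;" entity instead. This is only a sketch and assumes the scraped text always wraps the URL as &quot;...&amp; with no other "&amp;" inside the URL itself; a URL with several query parameters would be cut at the first one.

```powershell
# Assumption: the scraped text wraps the URL as &quot;<url>&amp; (HTML-encoded).
$someText = '&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;'

# Look behind for "&quot;", capture lazily, and stop before the first "&amp;".
$pattern = '(?<=&quot;)(.*?)(?=&amp;)'

$m = [regex]::Match($someText, $pattern)
if ($m.Success) {
    # $url is an illustrative variable name, not part of the original answer.
    $url = $m.Value
    $url
}
```

Checking $m.Success before reading $m.Value avoids silently working with an empty string when the page format changes.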

CodePudding user response:

I suggest using a -replace operation:

$str = '&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;' 

$str -replace '^&quot;(.+)&amp;$', '$1'
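
Because -replace returns the input string unchanged when the pattern does not match, it can be worth testing for a match first. A minimal sketch, using an anchored pattern with -match and the automatic $Matches variable (the name $url and the warning text are illustrative, not from the original answer):

```powershell
$str = '&quot;https://api16-something-c-text.sitename.com/aweme/v2/going/?video_id=v12044gd0666c8ohtdbc77u5ov2cqqd0&amp;'

# For a single string, -match returns $true or $false and fills $Matches on success.
if ($str -match '^&quot;(.+)&amp;$') {
    # Capture group 1: everything between the leading &quot; and the final &amp;
    $url = $Matches[1]
    $url
}
else {
    Write-Warning 'No URL found inside the expected &quot;...&amp; wrapper.'
}
```

The single-quoted pattern keeps PowerShell from trying to expand the trailing $ or the $1-style references as variables.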