How to get the value of a link from a website with PowerShell?

Time:06-23

I want to get the download URL of the latest version of GIMP from its site. I wrote a script, but it returns only the link name, and I do not know how to get the full URL.

$web = Invoke-WebRequest -Uri "https://download.gimp.org/pub/gimp/v2.10/windows/"
$web.Links | Where-Object href -like '*exe' | select -Last 1 | select -expand href  

The above code returns the link name (gimp-2.10.32-setup.exe), but I need the full URL ("https://download.gimp.org/pub/gimp/v2.10/windows/gimp-2.10.32-setup.exe"). Can someone guide me on how to do this?

CodePudding user response:

The href in each link is relative to the page you requested. Since you already know the base URL, just prepend it to the relative link yourself.

$Uri = 'https://download.gimp.org/pub/gimp/v2.10/windows/'
$web = Invoke-WebRequest -Uri $Uri
$ExeRelLink = $web.Links | Where-Object href -like '*exe' | Select-Object -Last 1 -ExpandProperty href
# Here is your download link.
$DownloadLink = $Uri + $ExeRelLink

Additional Note

You can combine -Last and -ExpandProperty from your two select statements into a single call, as done above.
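
As an alternative to string concatenation, here is a minimal sketch that lets .NET resolve the relative link against the base URL. It assumes PowerShell 5 or later (for the ::new syntax). The $relLink value is hard-coded here to stand in for the href returned by the pipeline above.

```powershell
# Base page URL from the question; $relLink stands in for the href value.
$baseUri = [System.Uri]'https://download.gimp.org/pub/gimp/v2.10/windows/'
$relLink = 'gimp-2.10.32-setup.exe'

# The Uri(Uri, string) constructor resolves a relative reference against a base.
$downloadUri = [System.Uri]::new($baseUri, $relLink)

$downloadUri.AbsoluteUri
# https://download.gimp.org/pub/gimp/v2.10/windows/gimp-2.10.32-setup.exe
```

This also leaves an href untouched if it is already an absolute URL. Note that the base needs its trailing slash; without it, the last path segment ("windows") is replaced rather than extended.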

CodePudding user response:

Several download sites have exactly the same or a very similar layout to this GIMP page, including many Apache projects such as Tomcat and ActiveMQ. I wrote a small function in the past to parse these and other pages, and it turned out to work for this GIMP page as well, so I thought it was worth sharing.

Function Extract-FilenameFromWebsite {
    [cmdletbinding()]
    Param(
        [parameter(Position=0,ValueFromPipeline)]
        $Url
    )

    begin{
        $pattern = '<a href.+">(?<FileName>.+?\..+?)</a>\s+(?<Date>\d+-.+?)\s{2,}(?<Size>\d+\w)?'
    }

    process{
        $website = Invoke-WebRequest $Url -UseBasicParsing

        switch -Regex ($website.Content -split '\r?\n'){
            $pattern {
                [PSCustomObject]@{
                    FileName     = $matches.FileName
                    URL          = '{0}{1}' -f $Url,$matches.FileName
                    LastModified = [datetime]$matches.Date
                    Size         = $matches.Size
                }
            }
        }
    }
}
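
To see what the pattern captures without making a web request, here is a small sketch that applies it, written with its + quantifiers, to a single line. The sample line is hypothetical, made up in the style of an Apache-like directory index; the real page may format its rows differently.

```powershell
$pattern = '<a href.+">(?<FileName>.+?\..+?)</a>\s+(?<Date>\d+-.+?)\s{2,}(?<Size>\d+\w)?'

# Hypothetical listing row (an assumption, not fetched from the site).
$line = '<a href="gimp-2.10.32-setup.exe">gimp-2.10.32-setup.exe</a>      2022-06-14 10:11  250M'

if ($line -match $pattern) {
    # -match fills the automatic $Matches table with the named groups.
    $entry = [PSCustomObject]@{
        FileName     = $Matches.FileName
        LastModified = [datetime]$Matches.Date
        Size         = $Matches.Size
    }
    $entry
}
```

For this sample, FileName is gimp-2.10.32-setup.exe and Size is 250M. The Date group relies on the date and time being separated from the size by at least two whitespace characters.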

The function assumes that the URL passed in ends with a trailing slash. To handle both cases, add this line at the start of the process block.

if($Url -notmatch '/$'){$Url = "$Url/"}

To get the latest version, call the function like this:

$url = 'https://download.gimp.org/pub/gimp/v2.10/windows/'

$latest = Extract-FilenameFromWebsite -Url $Url | Where-Object filename -like '*exe' |
    Sort-Object LastModified | Select-Object -Last 1

$latest.url

Or you could expand the URL property directly in the pipeline:

$url = 'https://download.gimp.org/pub/gimp/v2.10/windows/'

$latesturl = Extract-FilenameFromWebsite -Url $Url | Where-Object filename -like '*exe' |
    Sort-Object LastModified | Select-Object -Last 1 -ExpandProperty URL

$latesturl
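
Sorting by LastModified picks the most recently uploaded file, and taking the last link relies on the listing order. If you would rather pick by version number, here is a sketch that parses the version out of the file name. The sample names and the gimp-<version> naming scheme are assumptions for illustration; in practice the names would come from the FileName property or from $web.Links.

```powershell
# Hypothetical file names; alphabetical order would put 2.10.8 last.
$names = 'gimp-2.10.32-setup.exe', 'gimp-2.10.4-setup.exe', 'gimp-2.10.8-setup.exe'

$latestName = $names |
    Where-Object { $_ -match '^gimp-\d+(?:\.\d+)+' } |
    # Cast the dotted number to [version] so 2.10.32 sorts after 2.10.8.
    Sort-Object { [version]($_ -replace '^gimp-(\d+(?:\.\d+)+).*$', '$1') } |
    Select-Object -Last 1

$latestName
# gimp-2.10.32-setup.exe
```

The [version] type accepts two to four numeric components, so names with a different scheme would need their own parsing.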