PowerShell web scraping doesn't work in background

Time:10-25

I wrote a script that works fine:

# Use Internet Explorer
$ie = New-Object -ComObject 'internetExplorer.Application'
$ie.Visible= $true # Make it visible
# Set Credentials
$username="[email protected]"
$password="password"
#Navigate to URL
$ie.Navigate("https://service.post.ch/zopa/dlc/app/?service=dlc-web&inMobileApp=false&inIframe=false&lang=fr#!/main")
While ($ie.Busy -eq $true) {Start-Sleep -Seconds 3;}
# Login 
$usernamefield = $ie.document.getElementByID('isiwebuserid')
$usernamefield.value = "$username"
$passwordfield = $ie.document.getElementByID('isiwebpasswd')
$passwordfield.value = "$password"
$Link = $ie.document.getElementByID('actionLogin')
$Link.click()
Start-Sleep -seconds 5
# Find file to download
$link = $ie.Document.getElementsByTagName('A') | where-object {$_.innerText -like 'post_adressdaten*'}
$link.click()
Start-Sleep -seconds 3
# Press "Alt + s" on the download dialog
Add-Type -AssemblyName System.Windows.Forms
[System.Windows.Forms.SendKeys]::SendWait("%s")
Start-Sleep -seconds 3
# Quit Internet Explorer
$ie.Quit()

But if I change $ie.Visible= $true to $ie.Visible= $false, the script stops working.

Why?

Because of these two lines:

Add-Type -AssemblyName System.Windows.Forms
[System.Windows.Forms.SendKeys]::SendWait("%s")

These two lines send a keystroke to Internet Explorer's download dialog box, and if the browser runs in the background, the script cannot reach that dialog.

How can I send the input in the background or, alternatively, keep Internet Explorer always on top?

CodePudding user response:

As your own answer implies, in order to send keystrokes to an application with [System.Windows.Forms.SendKeys]::SendWait(), the application must have a window that (a) is visible and (b) has the (input) focus.

A simpler and faster alternative to the technique shown in your answer - where you use ad-hoc compilation of C# code that wraps WinAPI functions via P/Invoke declarations, via Add-Type - is the following:

# Create an Internet Explorer instance and make it visible.
$ie = New-Object -ComObject 'internetExplorer.Application'; $ie.Visible= $true

# Activate it (give it the focus), via its PID (process ID).
(New-Object -ComObject WScript.Shell).AppActivate(
  (Get-Process iexplore | Where-Object MainWindowHandle -eq $ie.hWnd).Id
)
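
Since AppActivate() returns a Boolean indicating whether activation succeeded, the keystrokes can be made conditional on it. The following is only a minimal sketch: the function name Send-KeysToProcess, its parameters, and the 200 ms pause are hypothetical choices for illustration, not part of the original script.

```powershell
# Hypothetical helper (name and parameters are illustrative assumptions):
# activate a window via its process ID and send keystrokes only if that worked.
function Send-KeysToProcess {
    param(
        [Parameter(Mandatory)] [int]    $ProcessId,
        [Parameter(Mandatory)] [string] $Keys
    )
    Add-Type -AssemblyName System.Windows.Forms
    $shell = New-Object -ComObject WScript.Shell
    # AppActivate() returns $true if a window of the process was activated.
    if (-not $shell.AppActivate($ProcessId)) {
        Write-Warning "Could not activate a window of process $ProcessId; keystrokes not sent."
        return $false
    }
    Start-Sleep -Milliseconds 200   # assumed pause so the window can take the focus
    [System.Windows.Forms.SendKeys]::SendWait($Keys)
    return $true
}

# Possible usage with the $ie object from above (Windows only):
# $iePid = (Get-Process iexplore | Where-Object MainWindowHandle -eq $ie.hWnd).Id
# Send-KeysToProcess -ProcessId $iePid -Keys '%s'
```

This does not remove the focus requirement; it only avoids typing into whichever window happens to be in front when activation fails.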

Taking a step back:

  • GUI scripting (automating a task by simulating user input to a GUI) is inherently unreliable; for instance, the user may click away from the window that is expected to have the focus.

  • While there is no built-in solution, it sounds like Selenium offers robust programmatic browser control.
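
Short of switching tools, part of the fragility described in the first point comes from the fixed Start-Sleep delays in the script, which assume the page is ready after a set number of seconds. A rough sketch of polling for a condition with a timeout instead is shown below; Wait-Until is a hypothetical helper name, and the default timeout and polling interval are arbitrary assumptions.

```powershell
# Hypothetical helper (not from the original posts): poll a condition until it
# yields a truthy value or the timeout expires.
function Wait-Until {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds   = 30,    # assumed default
        [int] $PollMilliseconds = 250    # assumed default
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while (-not (& $Condition)) {
        if ((Get-Date) -ge $deadline) { return $false }   # gave up
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $true                                          # condition met
}

# Possible usage with the $ie object from the question (Windows only):
# Wait-Until { -not $ie.Busy -and $ie.ReadyState -eq 4 }    # 4 = READYSTATE_COMPLETE
# Wait-Until { $ie.Document.getElementById('actionLogin') } # element has appeared
```

The function returns $false on timeout rather than throwing, so the caller decides whether to retry or abort.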

CodePudding user response:

The closest thing I have found is this and it's ugly as hell:

# Start Internet Explorer on top
Add-Type -TypeDefinition @"
    using System;
    using System.Runtime.InteropServices;

    public class Win32SetWindow {
        [DllImport("user32.dll")]
        [return: MarshalAs(UnmanagedType.Bool)]
        public static extern bool SetForegroundWindow(IntPtr hWnd);
    }
"@

$ie = new-object -comobject InternetExplorer.Application;
$ie.visible = $true;

[Win32SetWindow]::SetForegroundWindow($ie.HWND) # <-- Internet Explorer window on top

# Set Credentials
$username="[email protected]"
$password="password"
#Navigate to URL
$ie.Navigate("https://service.post.ch/zopa/dlc/app/?service=dlc-web&inMobileApp=false&inIframe=false&lang=fr#!/main")
While ($ie.Busy -eq $true) {Start-Sleep -Seconds 3;}
# Login 
$usernamefield = $ie.document.getElementByID('isiwebuserid')
$usernamefield.value = "$username"
$passwordfield = $ie.document.getElementByID('isiwebpasswd')
$passwordfield.value = "$password"
$Link = $ie.document.getElementByID('actionLogin')
$Link.click()
Start-Sleep -seconds 5
# Find file to download
$link = $ie.Document.getElementsByTagName('A') | where-object {$_.innerText -like 'post_adressdaten*'}
$link.click()
Start-Sleep -seconds 3
# Press "Alt + s" on the download dialog
Add-Type -AssemblyName System.Windows.Forms
[System.Windows.Forms.SendKeys]::SendWait("%n{TAB}{ENTER}")     # or use SendWait("%s")
Start-Sleep -seconds 3
# Quit Internet Explorer
$ie.Quit()