Headless mode disables web file downloading in Python

I want to download web files while running in headless mode. My program downloads the files correctly when NOT in headless mode, but once I add the option that hides the MS Edge window, the downloads no longer happen.

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select

driver = webdriver.Edge()
driver.get("URL")

id_box = driver.find_element(By.ID,"...")
pw_box = driver.find_element(By.ID,"...")
id_box.send_keys("...")
pw_box.send_keys("...")
log_in = driver.find_element(By.ID,"...")
log_in.click()

time.sleep(0.1) # If not included, get error: "Unable to locate element"

drop_period = Select(driver.find_element(By.ID,"..."))
drop_period.select_by_index(1)
drop_consul = Select(driver.find_element(By.ID,"..."))
drop_consul.select_by_visible_text("...")
drop_client = Select(driver.find_element(By.ID,"..."))
drop_client.select_by_index(1)

# The following files do not download with headless included:

driver.find_element(By.XPATH, "...").click()
driver.find_element(By.XPATH, "...").click()

CodePudding user response：

In that case, you might try downloading the file from its direct link using the Python requests library.

You'll need to get the URL by reading the element's href attribute.

Downloading and saving a file from a URL should then work as follows:

import requests as req

remote_url = 'http://www.example.com/file.txt'
local_file_name = 'my_file.txt'

data = req.get(remote_url)
data.raise_for_status()  # Stop on an HTTP error instead of saving an error page

# Save file data to local copy
with open(local_file_name, 'wb') as file:
    file.write(data.content)
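
One caveat: the page in the question sits behind a login, so a plain requests call would probably be rejected unless it carries the browser's session. A possible workaround, sketched here, is to copy the cookies from the Selenium driver into a requests session. The helper name session_from_cookies is hypothetical, and this assumes the site authenticates with cookies only.

```python
import requests


def session_from_cookies(cookies):
    """Build a requests.Session from cookie dicts such as driver.get_cookies() returns."""
    session = requests.Session()
    for cookie in cookies:
        session.cookies.set(
            cookie["name"],
            cookie["value"],
            domain=cookie.get("domain", ""),
            path=cookie.get("path", "/"),
        )
    return session
```

Usage might look like session = session_from_cookies(driver.get_cookies()) followed by data = session.get(remote_url), in place of the req.get call above.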


CodePudding user response：

Chrome has several headless modes. If you want to download files, use one of the newer modes below instead of the plain --headless flag.

For Chrome 109 and above, use:

options.add_argument("--headless=new")

For Chrome 108 and below, use:

options.add_argument("--headless=chrome")

Reference: https://github.com/chromium/chromium/commit/e9c516118e2e1923757ecb13e6d9fff36775d1f4