I am trying to download a file with Selenium, Geckodriver and Firefox, all controlled from Python. The file actually gets downloaded, but the driver keeps processing even after the download finishes.
Code I use to download a file:
from selenium import webdriver
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.dir", downloaddir)
fp.set_preference("browser.download.useDownloadDir", True)
fp.set_preference("browser.download.viewableInternally.enabledTypes", "")
fp.set_preference("browser.download.manager.useWindow", False)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.manager.closeWhenDone", True)
fp.set_preference("browser.helperApps.neverAsk.openFile", "application/zip")
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/zip")
fp.set_preference("pdfjs.disabled", True)
driver = webdriver.Firefox(firefox_profile=fp)
driver.get('http://speedtest.tele2.net/10MB.zip')
driver.close() # never reached: driver.get() above keeps blocking
Does anyone know what's going on? I know there is a workaround where you click on an element, but I work with a composed URL that cannot be clicked and has to be accessed directly.
Versions (linux):
Gecko 0.29.1
Firefox 89.0
Python 3.9.5
Update
There is an implicit timeout configured to 5 minutes, after which the call fails.
So my question is: is there a way to download a file directly with Selenium without raising any kind of error (in the ideal case, of course)?
CodePudding user response:
As suggested by @cards, it is more convenient to use requests or urllib for this kind of task. You can use selenium to paginate or click, then download the file with requests after inspecting the website's HTML.
import requests
# retrieve the web content
response = requests.get("http://speedtest.tele2.net/10MB.zip")
# save it as local file
with open("filename.zip", "wb") as file:
    file.write(response.content)
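For larger files it may be worth streaming the response to disk instead of holding all of `response.content` in memory. A minimal sketch (the function name is mine):

```python
import requests


def download_file(url, dest_path, chunk_size=8192):
    """Stream an HTTP download straight to disk, one chunk at a time."""
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()  # fail loudly on 4xx/5xx instead of saving an error page
        with open(dest_path, "wb") as fh:
            for chunk in response.iter_content(chunk_size=chunk_size):
                fh.write(chunk)
    return dest_path
```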
P.S. The zip file downloaded from the URL you provided is damaged.
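Edit: if the composed URL is only reachable from a logged-in browser session, one option is to copy Selenium's cookies into a `requests.Session` and download with that. A sketch (`session_from_driver` is a hypothetical helper, not part of either library):

```python
import requests


def session_from_driver(driver):
    """Copy the Selenium browser's cookies into a requests.Session."""
    session = requests.Session()
    for cookie in driver.get_cookies():
        # driver.get_cookies() returns dicts with name/value/domain/path keys
        session.cookies.set(
            cookie["name"],
            cookie["value"],
            domain=cookie.get("domain"),
            path=cookie.get("path", "/"),
        )
    return session
```

Then `session_from_driver(driver).get(file_url)` fetches the file with the same authentication the browser has.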