when i run this
wget.download("http://downloads.dell.com/FOLDER06808437M/1/7760 AIO-WIN10-A11-5VNTG.CAB")
It shows this Error Code
File "C:\Program Files\Python39\lib\urllib\request.py", line 641, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)urllib.error.HTTPError: HTTP Error 404: Not Found
But when I run wget http://downloads.dell.com/FOLDER06808437M/1/7760 AIO-WIN10-A11-5VNTG.CAB
manually it works perfectly fine
CodePudding user response:
I investigated wget.download
source code and there seems to be bug, piece of source code
if PY3K:
# Python 3 can not quote URL as needed
binurl = list(urlparse.urlsplit(url))
binurl[2] = urlparse.quote(binurl[2])
binurl = urlparse.urlunsplit(binurl)
else:
binurl = url
so it make assumption URL needs to be quoted that is illegal character like space replaced by codes after %
sign, but this was already done as your url contain
rather than space. Your URL is altered although it should not
import urllib.parse as urlparse
url = "http://downloads.dell.com/FOLDER06808437M/1/7760 AIO-WIN10-A11-5VNTG.CAB"
binurl = list(urlparse.urlsplit(url))
binurl[2] = urlparse.quote(binurl[2])
binurl = urlparse.urlunsplit(binurl)
print(binurl) # http://downloads.dell.com/FOLDER06808437M/1/7760%20AIO-WIN10-A11-5VNTG.CAB
You might either counterweight this issue by providing URL in form which needs escaping, in this case
import wget
wget.download("http://downloads.dell.com/FOLDER06808437M/1/7760 AIO-WIN10-A11-5VNTG.CAB")
xor use urllib.request.urlretrieve
, most basic form is
import urllib.request
urllib.request.urlretrieve("http://downloads.dell.com/FOLDER06808437M/1/7760 AIO-WIN10-A11-5VNTG.CAB", "776 AIO-WIN10-A11-5VNTG.CAB")
where arguments are URL and filename. Keep in mind that used this way there is not progress indicator (bar), so you need to wait until download complete.