The objective is to download a tar.gz
from a cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz
The file can be downloaded without any issue with wget
.
!wget cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz --no-check-certificate
However, the download the file using requests
import requests
url='cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
r = requests.get(url)
Return an error
MissingSchema Traceback (most recent call last)
<ipython-input-11-fa35f2c0ddc0> in <module>()
1 url='cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
----> 2 r = requests.get(url)
5 frames
/usr/local/lib/python3.7/dist-packages/requests/models.py in prepare_url(self, url, params)
386 error = error.format(to_native_string(url, 'utf8'))
387
--> 388 raise MissingSchema(error)
389
390 if not host:
MissingSchema: Invalid URL 'cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz': No schema supplied. Perhaps you meant http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz?
May I know what is the issue?
CodePudding user response:
You're missing http:// or https:// (the schema, as the error message says) in your url
variable.
url = 'https://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
CodePudding user response:
You miss the http header
import requests
requests.get("http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz")
This should also work
import wget
wget.download("http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz", out="YOUR_PATH")