Unable to download url link with requests in Python

Time:06-10

The objective is to download a tar.gz file from cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz

The file can be downloaded without any issue with wget.

!wget cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz --no-check-certificate

However, downloading the file with requests:

import requests
url='cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
r = requests.get(url) 

raises an error:

MissingSchema                             Traceback (most recent call last)

<ipython-input-11-fa35f2c0ddc0> in <module>()
      1 url='cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
----> 2 r = requests.get(url)

5 frames

/usr/local/lib/python3.7/dist-packages/requests/models.py in prepare_url(self, url, params)
    386             error = error.format(to_native_string(url, 'utf8'))
    387 
--> 388             raise MissingSchema(error)
    389 
    390         if not host:

MissingSchema: Invalid URL 'cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz': No schema supplied. Perhaps you meant http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz?

May I know what the issue is?

CodePudding user response:

You're missing http:// or https:// (the URL scheme, which the error message calls the schema) in your url variable.

url = 'https://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz'
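If URLs may arrive with or without a scheme, one option is to normalize them before calling requests. This is only a minimal sketch: `ensure_scheme` is a hypothetical helper, not part of requests, and defaulting to `http` is an assumption based on the suggestion in the error message.

```python
def ensure_scheme(url, default="http"):
    """Prepend default:// when url has no http:// or https:// prefix.

    Hypothetical helper for illustration; not part of the requests library.
    """
    if url.startswith(("http://", "https://")):
        return url  # already has a scheme, leave it untouched
    return f"{default}://{url}"


url = ensure_scheme("cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz")
print(url)
```

The resulting string can then be passed to `requests.get` without triggering `MissingSchema`.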

CodePudding user response:

You are missing the http:// scheme at the start of the URL:

import requests
requests.get("http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz")

This should also work, using the third-party wget package for Python:

import wget
wget.download("http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz", out="YOUR_PATH")
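Note that `requests.get` on its own only fetches the response; it does not save anything to disk. Here is a rough sketch of writing the archive to a file in chunks, assuming the corrected URL is reachable. The function name `download_file`, the chunk size, and the timeout are illustrative choices, not something from the question.

```python
import requests


def download_file(url, out_path, chunk_size=8192):
    """Stream the response body at url into out_path and return out_path."""
    # stream=True avoids loading the whole archive into memory at once
    with requests.get(url, stream=True, timeout=30) as r:
        r.raise_for_status()  # fail loudly on 4xx/5xx instead of saving an error page
        with open(out_path, "wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return out_path
```

It could be called as `download_file("http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K-Word_V3.0.tar.gz", "IIIT5K-Word_V3.0.tar.gz")`. requests verifies TLS certificates by default, so if the https URL raises an `SSLError`, that is a separate problem from the missing scheme.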