I am trying to do a very simple Python request using requests.get
but am getting the following error with this code:
import requests
url = 'https://www.tesco.com/'
status = requests.get(url)
The error:
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.tesco.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)')))
Can anyone explain to me how to fix this and more importantly what the error means?
Many Thanks
CodePudding user response:
Explanation
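The error means that Python could not verify the server's SSL certificate. The part "unable to get local issuer certificate" says that the certificate of the issuing authority (CA) was not found in the local trust store, so the certificate chain could not be validated. An invalid or expired certificate also makes verification fail, although usually with a different message.
When making a GET request to a server such as www.tesco.com you have two options: http and https. With https, the server presents your client (your script) with an SSL certificate, which lets you verify that you are connecting to the legitimate website. It also secures and encrypts the data being transferred between your script and the server.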
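To make the verification step concrete, here is a minimal sketch that uses only Python's standard ssl module. It shows which checks a default TLS client context has switched on. The helper name describe_default_context is hypothetical and exists only for this illustration; it is not part of requests.

```python
import ssl


def describe_default_context():
    """Return the verification settings of a default client-side TLS context."""
    # create_default_context() loads the system's trusted CA certificates
    # and turns on certificate and hostname verification.
    context = ssl.create_default_context()
    return {
        "certificate_required": context.verify_mode == ssl.CERT_REQUIRED,
        "hostname_checked": context.check_hostname,
    }


settings = describe_default_context()
print(settings)
```

Both settings are enabled by default, which is why a server certificate whose issuer cannot be found locally ends in CERTIFICATE_VERIFY_FAILED instead of a silent connection.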
Solution
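Disable the SSL check. This turns off the protection described above, so use it only for testing or for hosts you trust.
import requests
url = 'https://www.tesco.com/'
requests.get(url, verify=False)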
OR
Use a Session and disable the SSL certificate check
import requests
url = 'https://www.tesco.com/'
# Use a Session and disable the SSL certificate check
session = requests.Session()
session.verify = False
# Ignore environment settings such as proxies and CA bundle variables
session.trust_env = False
session.get(url=url)
Extra Info 1
Ensure the date and time are set correctly on your machine. A certificate is only valid within a certain date range, and verification compares that range against your local date and time, so a wrong clock is a common cause of this kind of error.
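As a rough illustration of that date comparison, the sketch below checks a timestamp against a certificate's validity window using the standard ssl module. The function name clock_within_validity and the example dates are made up for illustration; the dict is shaped like the result of SSLSocket.getpeercert().

```python
import ssl
import time


def clock_within_validity(cert, now=None):
    """Check whether a timestamp falls inside a certificate's validity window.

    `cert` is a dict shaped like the result of SSLSocket.getpeercert().
    """
    if now is None:
        now = time.time()  # local clock, seconds since the epoch
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    return not_before <= now <= not_after


# Made-up validity dates, for illustration only.
example_cert = {
    "notBefore": "Jan  1 00:00:00 2023 GMT",
    "notAfter": "Jan  1 00:00:00 2024 GMT",
}
inside = ssl.cert_time_to_seconds("Jun  1 00:00:00 2023 GMT")
outside = ssl.cert_time_to_seconds("Jun  1 00:00:00 2025 GMT")
print(clock_within_validity(example_cert, now=inside))
print(clock_within_validity(example_cert, now=outside))
```

If the local clock were set to the second date, the same certificate would be rejected even though nothing is wrong on the server side.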
Extra Info 2
You may need to install the latest root CA certificates on your machine.
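With requests, the root certificates normally come from the certifi package, so one way to act on this is to keep verification on and point the session at a CA bundle file. The sketch below assumes certifi is installed (it is a dependency of requests); the helper name make_verified_session is hypothetical.

```python
import os

import certifi  # installed as a dependency of requests
import requests


def make_verified_session(ca_bundle=None):
    """Build a session that verifies certificates against a CA bundle file."""
    session = requests.Session()
    # Fall back to the certifi bundle when no custom file is given.
    session.verify = ca_bundle or certifi.where()
    return session


session = make_verified_session()
print(session.verify)
print(os.path.isfile(session.verify))
```

Running pip install --upgrade certifi refreshes that bundle. If your network uses its own certificate authority, you could pass the path of that CA file as ca_bundle instead of disabling verification.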
CodePudding user response:
Paraphrasing a similar post to fit your specific question.
Response 403 means forbidden, in other words, the website understands the request but doesn't allow access. It could be a security measure to prevent scraping.
As a workaround, you can add a header in your request so that the code acts as if you're accessing it using a web browser.
import requests
url = "https://www.tesco.com"
headers = {'user-agent': 'Safari/537.36'}
response = requests.get(url, headers=headers)
print(response)
You should get response 200.
The 'user-agent' entry in the headers makes it seem that you're accessing the site through a Safari browser.
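If you want to confirm which header would be sent without contacting the server, a small sketch like this builds the request and inspects it. It only uses requests.Request and its prepare() method; nothing is sent over the network.

```python
import requests

url = "https://www.tesco.com"
headers = {"user-agent": "Safari/537.36"}

# prepare() builds the request object without sending it.
prepared = requests.Request("GET", url, headers=headers).prepare()

# Header names are matched case-insensitively.
print(prepared.headers["User-Agent"])
```

Whether the site accepts this value is up to the server, so treat the header as something to experiment with, not a guaranteed fix.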