I would like to scrape a table from a website, but the page requires a login. When I sign in to the site, collect the cookies with my program, and then try to read the table, I get an HTTP error. This is the error:
HTTPError                                 Traceback (most recent call last)
in ()
      8
      9 print(cookies_dictionary)
---> 10 df = pd.read_html('https://exames.genera.com.br/busca-parentes')
     11 df

12 frames
/usr/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden
I tried to get the cookies from the website, but unfortunately that does not work. Here is my code:
import requests
import pandas as pd
a_session = requests.Session()
url = 'https://exames.genera.com.br/busca-parentes'
a_session.get(url)
session_cookies = a_session.cookies
cookies_dictionary = session_cookies.get_dict()
print(cookies_dictionary)
df = pd.read_html(url) #Problem here
df
I know that the authentication cookie is this one:
Can someone help me?
CodePudding user response:
Have you tried using HTTPBasicAuth from the requests package?
https://docs.python-requests.org/en/latest/user/advanced/#client-side-certificates
Towards the end of that page it shows how you can pass a username and password to the website you're attempting to access.
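As a rough sketch of what that looks like, the snippet below builds a GET request with HTTPBasicAuth and prepares it without sending it, so you can inspect the header that would go out. The helper name build_basic_auth_request and the user/passwd credentials are made up for illustration, and it is only an assumption that the target site accepts HTTP Basic authentication at all; many sites use a login form and a session cookie instead.

```python
import requests
from requests.auth import HTTPBasicAuth


def build_basic_auth_request(url, username, password):
    """Prepare (but do not send) a GET request that carries Basic credentials.

    Hypothetical helper for illustration only.
    """
    request = requests.Request("GET", url, auth=HTTPBasicAuth(username, password))
    return request.prepare()


# Placeholder credentials; httpbin exposes a Basic-auth test endpoint.
prepared = build_basic_auth_request(
    "https://httpbin.org/basic-auth/user/passwd", "user", "passwd"
)

# The credentials travel base64-encoded in the Authorization header.
print(prepared.headers["Authorization"])
```

To actually send it, requests.get(url, auth=HTTPBasicAuth(username, password)) does the same thing in one call.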
You can use http://httpbin.org/#/ to see if you're sending your request correctly.
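One more thing worth checking in the question's code: pd.read_html(url) downloads the page on its own (the traceback goes through urllib), so the cookies stored in a_session are never sent. A possible workaround, sketched below, is to fetch the page with the session and hand the HTML text to pandas. The function name read_tables_with_session is hypothetical, and the sketch assumes the session passed in is already authenticated, which a plain a_session.get(url) does not guarantee.

```python
import io

import pandas as pd


def read_tables_with_session(session, url):
    """Fetch `url` through `session` and parse every HTML table in the response.

    Assumes `session` is a requests.Session that already holds valid login
    cookies; how to obtain them depends on the site.
    """
    response = session.get(url)
    # Fail loudly on 403 and other error codes instead of parsing an error page.
    response.raise_for_status()
    # read_html accepts a file-like object, so no second download happens.
    return pd.read_html(io.StringIO(response.text))
```

With the names from the question this would be called as tables = read_tables_with_session(a_session, url), and tables[0] would be the first table found. If the server still answers 403, the session is probably not logged in yet, or the site rejects the request for another reason.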