I am scraping the following website in Python using requests.
When you load the data for any fuel type with the browser's inspect tool open, you will see an API call named getFilteredInventory being triggered.
Suppose you have selected Diesel; I want to scrape all the data that is visible in the response section of that API call. But I cannot simply open the URL. In the headers section you will see a request payload, and I think I have to send it too, as a parameter.
I searched the internet and found that I can send the request payload as params in requests.
This is my code:
payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"
page = requests.get(url, params=payload)
data = page.json()
But the page is showing a 400 error. How can I solve this error?
API Link: https://easyauto123.com.au/api/v1/getFilteredInventory
CodePudding user response:
Your variant:
page = requests.get(url, params=payload)
Try this variant:
page = requests.post(url, json=payload)
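A slightly fuller sketch of the same idea, in case it helps. The helper names build_payload and fetch_inventory are hypothetical, and the timeout and raise_for_status() call are my additions, not something the API requires. The shape of the response is not shown in the question, so the function just returns whatever JSON comes back.

```python
import requests

URL = "https://easyauto123.com.au/api/v1/getFilteredInventory"


def build_payload(fuel_type):
    """Return the request body for one fuel type (shape copied from the question)."""
    return {"Search": {"FuelType": [fuel_type]}, "Sort": {"Featured": "true"}}


def fetch_inventory(fuel_type, post=requests.post, timeout=10):
    """POST the payload as a JSON body and return the decoded response.

    `post` is injectable only so the function can be exercised without
    touching the network; by default it is requests.post.
    """
    response = post(URL, json=build_payload(fuel_type), timeout=timeout)
    # Raise an HTTPError on 4xx/5xx instead of failing later inside .json()
    response.raise_for_status()
    return response.json()
```

Calling fetch_inventory("Diesel") would then send the same request as the one-liner above.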
CodePudding user response:
The code you posted makes the requests library send a GET request to the URL, with the "payload" you defined turned into query parameters, and the server returns a 404. You can see this for yourself like this:
payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"
page = requests.get(url, params=payload)
print(page.url)
# => https://easyauto123.com.au/api/v1/getFilteredInventory?Search=FuelType&Sort=Featured
print(page.status_code)
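If you would rather inspect this without sending anything, a minimal sketch using a prepared request shows the same URL. It builds the request locally, so it says nothing about how the server responds.

```python
import requests

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"

# Build the request locally; .prepare() does not contact the server.
prepared = requests.Request("GET", url, params=payload).prepare()

# Nested dicts are not serialized: only the inner keys end up as values.
print(prepared.url)
# => https://easyauto123.com.au/api/v1/getFilteredInventory?Search=FuelType&Sort=Featured

# A request built with params= alone has no body at all.
print(prepared.body)
# => None
```

The "Diesel" value never appears in the URL, so the filter you wanted is lost before the request leaves your machine.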
The API is expecting a POST request with a JSON payload (which is typically signified during the request by the HTTP header Content-Type having a value of application/json). The requests library exposes this functionality:
page = requests.post(url, json=payload)
You can inspect the request headers used for a particular request via the request attribute stored on the returned object:
page = requests.post(url, json=payload)
print(page.request.headers)
# => {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '66', 'Content-Type': 'application/json'}
print(page.request.headers['Content-Type'])
# => application/json
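To make it clearer what json=payload amounts to, here is a rough standard-library equivalent. This is an illustration of the idea (serialize the dict, send it as the body, label it as JSON), not a copy of what requests does internally, and nothing is sent here.

```python
import json
import urllib.request

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"

# Serialize the dict to JSON text, then to bytes for the request body.
body = json.dumps(payload).encode("utf-8")

# Build (but do not send) a POST request carrying that body.
req = urllib.request.Request(
    url,
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(req.get_method())
# => POST
print(len(body))
# => 66
```

The body length of 66 bytes matches the Content-Length value in the headers printed above.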
You can compare this with the request that results from the following:
page = requests.post(url, params=payload)
print(page.request.url)
# => https://easyauto123.com.au/api/v1/getFilteredInventory?Search=FuelType&Sort=Featured
print(page.request.headers)
# => {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '0'}
You may notice that this request succeeds, though the response body is different. Also, note that the Content-Length header's value is 0 (vs 66) and that the Content-Type header is not set. The Content-Length header denotes the size of the payload, which in this case is 0.
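The header differences can be put side by side without sending anything. This is a small sketch using prepared requests; the describe helper is a hypothetical name, and a prepared request built this way does not include the session defaults such as User-Agent.

```python
import requests

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"


def describe(method, **kwargs):
    """Prepare (but do not send) a request; return (Content-Type, Content-Length)."""
    prepared = requests.Request(method, url, **kwargs).prepare()
    return (
        prepared.headers.get("Content-Type"),
        prepared.headers.get("Content-Length"),
    )


# Compare the three ways of passing the payload discussed in this thread.
for label, method, kwargs in [
    ("GET  params=", "GET", {"params": payload}),
    ("POST params=", "POST", {"params": payload}),
    ("POST json=  ", "POST", {"json": payload}),
]:
    print(label, describe(method, **kwargs))
```

Only the json= variant sets Content-Type to application/json and carries a non-empty body.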