How to send request payload as parameter in python web scraping?

Time:10-29

I am scraping the following website in Python using the requests library.

In the browser's Inspect Element tools, when you load the data for any fuel type, you will see an API call named getFilteredInventory being triggered.

Suppose you have selected Diesel. I want to scrape all the data that is visible in the Response section of that API call, but I cannot simply open the URL. In the Headers section you will see a request payload, and I think I have to send it too, as a parameter.

I searched the internet and found that I can send the request payload through the params argument in requests.

This is my code:

import requests

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"
page = requests.get(url, params=payload)
data = page.json()

But the request returns a 400 error. How can I resolve this?

API Link: https://easyauto123.com.au/api/v1/getFilteredInventory

CodePudding user response:

Your variant:

page = requests.get(url, params=payload)

Try this variant:

page = requests.post(url, json=payload)

CodePudding user response:

The code you posted makes the requests library send a GET request to the URL, with the "payload" you've defined turned into query parameters, and the server returns a 404. You can see this for yourself like this:

import requests

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"
page = requests.get(url, params=payload)

print(page.url)
# => https://easyauto123.com.au/api/v1/getFilteredInventory?Search=FuelType&Sort=Featured
print(page.status_code)
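
The same encoding can be checked without contacting the server. This is only a minimal sketch: it builds the request with requests.Request and prepares it locally, so nothing is sent over the network.

```python
import requests

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"

# Build the request object and prepare it locally; nothing is sent.
prepared = requests.Request("GET", url, params=payload).prepare()

# Only the keys of the nested dicts end up in the query string;
# the values "Diesel" and "true" are lost.
print(prepared.url)
```

Because params flattens each nested dict to its keys, the filter values never reach the server in this form.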

The API expects a POST request with a JSON payload (which is typically signified by the HTTP header Content-Type having a value of application/json). The requests library exposes this functionality through the json argument:

page = requests.post(url, json=payload)
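
For a scraper, that call could be wrapped roughly as follows. This is a sketch under assumptions: the helper names build_payload and fetch_inventory and the 10-second timeout are made up for illustration, and the shape of the returned JSON is not known here, so it is passed back unchanged.

```python
import requests

API_URL = "https://easyauto123.com.au/api/v1/getFilteredInventory"


def build_payload(fuel_type):
    """Hypothetical helper: the payload from the question for one fuel type."""
    return {"Search": {"FuelType": [fuel_type]}, "Sort": {"Featured": "true"}}


def fetch_inventory(fuel_type, timeout=10):
    """Hypothetical helper: POST the payload as JSON, return the decoded body."""
    response = requests.post(API_URL, json=build_payload(fuel_type), timeout=timeout)
    # Raise requests.HTTPError on a 4xx/5xx status instead of failing later in .json().
    response.raise_for_status()
    return response.json()
```

Calling fetch_inventory("Diesel") would then send the same request as the one-liner above, with an explicit error if the server rejects it.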

You can inspect the headers that were sent for a particular request via the request attribute stored on the returned response object:

page = requests.post(url, json=payload)

print(page.request.headers)
# => {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '66', 'Content-Type': 'application/json'}

print(page.request.headers['Content-Type'])
# => application/json
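
The header and the body can also be checked offline. A minimal sketch, again preparing the request locally instead of sending it (session defaults such as User-Agent are not added in this case):

```python
import json

import requests

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}
url = "https://easyauto123.com.au/api/v1/getFilteredInventory"

# Prepare a POST with a JSON body locally; nothing is sent.
prepared = requests.Request("POST", url, json=payload).prepare()

# json= serializes the dict into the body and sets the Content-Type header.
print(prepared.headers["Content-Type"])

# The body decodes back to the original nested payload, values included.
print(json.loads(prepared.body) == payload)
```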

You can compare this with the request that results from the following:

page = requests.post(url, params=payload)
print(page.request.url)
# => https://easyauto123.com.au/api/v1/getFilteredInventory?Search=FuelType&Sort=Featured

print(page.request.headers)
# => {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '0'}

You may notice that this request succeeds, though the response body is different. Also note that the Content-Length header's value is 0 (vs. 66) and that the Content-Type header is not set. The Content-Length header denotes the size of the request body in bytes, which in this case is 0 because params only changes the query string and nothing is sent in the body.
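
The value 66 can be reproduced with the standard library alone. A small sketch, assuming the body is serialized with json.dumps and its default separators, which matches the header shown earlier:

```python
import json

payload = {"Search": {"FuelType": ["Diesel"]}, "Sort": {"Featured": "true"}}

# Serialize the payload the way a JSON request body would be sent.
body = json.dumps(payload).encode("utf-8")

# Content-Length is the number of bytes in that body.
print(len(body))  # 66
```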
