Why won't Python requests pagination work?

Time:01-16

I'm trying to use pagination to request multiple pages of rental listings from Zillow; without it I'm limited to the first page. However, my code seems to load only the first page, even when I specify a page number manually.

# Rent
import requests
from bs4 import BeautifulSoup as soup
import json

url = 'https://www.zillow.com/torrance-ca/rentals'

params = {
  'q': {"pagination":{"currentPage": 1},"isMapVisible":False,"filterState":{"fore":{"value":False},"mf":{"value":False},"ah":{"value":True},"auc":{"value":False},"nc":{"value":False},"fr":{"value":True},"land":{"value":False},"manu":{"value":False},"fsbo":{"value":False},"cmsn":{"value":False},"fsba":{"value":False}},"isListVisible":True}
}

headers = {
  # headers were copied from network tab on developer tools in chrome
}


html = requests.get(url=url,headers=headers, params=params)
html.status_code

bsobj = soup(html.content, 'lxml')

for script in bsobj.find_all('script'):

  inner_text_with_string = str(script.string)
  if inner_text_with_string[:18] == '<!--{"queryState":':
    my_query = inner_text_with_string

my_query = my_query.strip('><!-')

data = json.loads(my_query)
data = data['cat1']['searchResults']['listResults']

print(data)

This returns about 40 listings. However, if I change "pagination":{"currentPage": 1} to "pagination":{"currentPage": 2}, it returns the same listings! It's as if the pagination parameter isn't recognized.

I believe these are the correct parameters, as I took them straight from the URL query string and used http://urlprettyprint.com/ to make it readable.

Any thoughts on what I'm doing wrong?

CodePudding user response:

Passing this dictionary through the params argument makes requests send the wrong query string. You can confirm this by printing the response's URL (html.url in your code). What I would do is build the query string with urllib.parse.urlencode:

from urllib.parse import urlencode
...
html = requests.get(url=f"{url}?{urlencode(params)}", headers=headers)
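To see why the two approaches differ, here is a minimal standard-library sketch. The shortened query_state dictionary is a hypothetical stand-in for the full one in the question. As far as I can tell, requests iterates over a dict value and keeps only its keys, which is the same thing urlencode does with doseq=True. Note also that plain urlencode sends the Python repr of the dictionary (single quotes, False/True), so serializing with json.dumps first may be the safer variant. Whether the server accepts either form is not something this sketch can confirm.

```python
import json
from urllib.parse import urlencode, parse_qs

# Hypothetical, shortened version of the query state from the question.
query_state = {
    "pagination": {"currentPage": 2},
    "isMapVisible": False,
    "isListVisible": True,
}

# doseq=True treats the nested dict as a sequence and keeps only its keys.
# This appears to be what requests does when the dict is passed via params=.
keys_only = urlencode({"q": query_state}, doseq=True)
print(keys_only)

# Plain urlencode() (the approach above) sends str(dict), i.e. the Python repr.
repr_encoded = urlencode({"q": query_state})
print(parse_qs(repr_encoded)["q"][0])

# Serializing to JSON first yields a value that can be parsed as JSON again.
json_encoded = urlencode({"q": json.dumps(query_state, separators=(",", ":"))})
decoded = json.loads(parse_qs(json_encoded)["q"][0])
print(decoded["pagination"]["currentPage"])
```

The first print shows that the page number never reaches the URL in the keys-only form, which would explain why every request returns page 1. The JSON-encoded string can be appended to the URL in the same way as in the line above, f"{url}?{json_encoded}".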