I'm trying to figure out any way using requests module to produce score url using a search keyword in a website. For example, when I type this address 820 HABGOOD ST City of White Rock
in the search bar of this website, I get this score url.
I dug around a lot in chrome dev tools to find any way to produce the same score url using requests module, but I ended up getting the following.
import requests
link = 'https://www.walkscore.com/auth/search_suggest'
params = {
'query': '820 HABGOOD ST City of White Rock',
'skip_entities': '0'
}
with requests.Session() as s:
s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
s.headers['X-Requested-With'] = 'XMLHttpRequest'
res = s.get(link,params=params)
print(res.json())
Which produces (no slug or score url in there):
{'query': '820 HABGOOD ST City of White Rock', 'suggestions': [], 'entities': True}
How can I produce score url using an address in the search box?
CodePudding user response:
you missed the good request:
GET: https://www.walkscore.com/score/820-HABGOOD-ST-City-of-White-Rock
this is just your request with dashes replacing spaces.
the request gets a 301 MOVED PERMANENTLY
and sends you to the correct place
import requests
from bs4 import BeautifulSoup
link = 'https://www.walkscore.com/score/'
query='820 HABGOOD ST City of White Rock'
link ='-'.join(query.split())
with requests.Session() as s:
s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
r = s.get(link)
soup=BeautifulSoup(r.text, 'lxml')
print(soup.select_one('#address-header > div > div.float-left-noncleared').text)
>>> 820 Habgood Street White Rock, British Columbia, V4B 4W3