Failed to produce content url using an address in a search box of a website

Time:12-15

I'm trying to find a way to produce a score URL from a search keyword on a website using the requests module. For example, when I type the address 820 HABGOOD ST City of White Rock into the search bar of this website, I get this score URL.

I dug around in Chrome dev tools for a request that would produce the same score URL with the requests module, but the best I could come up with is the following.

import requests

link = 'https://www.walkscore.com/auth/search_suggest'
params = {
    'query': '820 HABGOOD ST City of White Rock',
    'skip_entities': '0'
}

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    s.headers['X-Requested-With'] = 'XMLHttpRequest'
    res = s.get(link,params=params)
    print(res.json())

Which produces (no slug or score url in there):

{'query': '820 HABGOOD ST City of White Rock', 'suggestions': [], 'entities': True}

How can I produce score url using an address in the search box?

CodePudding user response:

You missed the relevant request:

GET: https://www.walkscore.com/score/820-HABGOOD-ST-City-of-White-Rock

This is just your query with the spaces replaced by dashes. The request gets a 301 Moved Permanently response, which redirects you to the correct page.

import requests
from bs4 import BeautifulSoup

link = 'https://www.walkscore.com/score/'
query = '820 HABGOOD ST City of White Rock'
# Append the dashed query to the base URL (a plain assignment would discard the base)
link += '-'.join(query.split())

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    r = s.get(link)
    soup=BeautifulSoup(r.text, 'lxml')
    print(soup.select_one('#address-header > div > div.float-left-noncleared').text)
    
>>> 820 Habgood Street  White Rock, British Columbia, V4B 4W3
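
Since the question asks for the score URL itself rather than the address text, here is a minimal sketch of how the final URL could be read back after the redirect. The helper names `build_score_url` and `resolve_score_url` are hypothetical, and the sketch assumes the site still redirects the dashed URL as described above. It relies on requests following redirects for GET by default, so `r.url` holds the address the request ended at.

```python
from urllib.parse import quote

# Base path taken from the answer above
BASE_URL = 'https://www.walkscore.com/score/'


def build_score_url(address):
    """Hypothetical helper: join the words of an address with dashes."""
    slug = '-'.join(address.split())
    # Percent-encode anything that is not URL-safe; letters, digits and dashes pass through
    return BASE_URL + quote(slug)


def resolve_score_url(session, address):
    """Hypothetical helper: request the dashed URL and return where the redirects end."""
    r = session.get(build_score_url(address))
    r.raise_for_status()
    # Redirects are followed automatically on GET, so r.url is the final location
    return r.url
```

To try it, pass a `requests.Session()` with the same User-Agent header as in the answer, for example `resolve_score_url(s, '820 HABGOOD ST City of White Rock')`. The intermediate 301 responses, if any, are available in `r.history`.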