Selenium Webdriver Not Finding Elements/Behavior Changed


My previously working scraping code (Python/Selenium) broke and now returns an empty list. As a single example, it is meant to visit: https://powersearch.jll.com/ca-en/property/52770/centurion-plaza-10335-172-street

and return the PDF link: https://powersearch.jll.com/res/docs/jll - centurion plaza - brochure - 02232022_11108659.pdf

import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

driver_service = Service(executable_path="C:\\WPy64-39100\\chromedriver.exe")
chrome_options = Options()
chrome_options.add_experimental_option("detach", True)
chrome_options.add_argument("--headless")

driver = webdriver.Chrome(service=driver_service, options=chrome_options)

site = 'https://powersearch.jll.com/ca-en/property/52770/centurion-plaza-10335-172-street'

driver.get(site)
time.sleep(10)  # crude fixed wait for the page to render

# Collect the href of every element carrying the pt-res-link class
elements = driver.find_elements(By.CLASS_NAME, 'pt-res-link')
links = [e.get_attribute("href") for e in elements]

print(links)

I've tried various iterations of find_element (and find_elements), and locating by class name "pt-res-link" is not working reliably. Any advice appreciated, thanks.
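
One variation I am considering is replacing the fixed sleep with an explicit wait. A minimal sketch of what I mean (it assumes the rendered page still puts the pt-res-link class on the link elements, which may no longer be true):

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 30 seconds for at least one matching element instead of sleeping
wait = WebDriverWait(driver, 30)
elements = wait.until(
    EC.presence_of_all_elements_located((By.CLASS_NAME, "pt-res-link"))
)
links = [e.get_attribute("href") for e in elements]
print(links)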

CodePudding user response:

I used requests and bs4 to get your desired output.

Full Code

import requests
from bs4 import BeautifulSoup

url = "https://powersearch.jll.com/ca-en/property/52770/centurion-plaza-10335-172-street"
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"
}
r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, "html.parser")

# The PDF link lives inside the script tag with id "PowerSearch-state",
# so split that script on ";" and print the piece containing a .pdf URL,
# trimming everything from "&q" onward.
script = soup.find("script", {"id": "PowerSearch-state"}).text.split(";")
for i in script:
    if ".pdf" in i and "https://" in i:
        print(i.split("&q")[0])

Output

https://powersearch.jll.com/res/docs/jll - centurion plaza - brochure - 02232022_11108659.pdf
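
If splitting on ";" ever proves brittle, a regex over the same state script is another option. This is a minimal sketch under the assumption that the brochure URL appears literally in the script text (as the output above suggests); the pattern is illustrative, not taken from the site:

import re

import requests
from bs4 import BeautifulSoup

url = "https://powersearch.jll.com/ca-en/property/52770/centurion-plaza-10335-172-street"
headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"
}
r = requests.get(url, headers=headers)
soup = BeautifulSoup(r.content, "html.parser")
state = soup.find("script", {"id": "PowerSearch-state"}).text

# Pull every https:// URL ending in .pdf out of the embedded state script
pdf_links = re.findall(r'https://[^"&]+?\.pdf', state)
print(pdf_links)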