I'm trying to collect the links to the houses listed on this website, but my code returns only about 9 elements even though the page contains many more. I also tried Beautiful Soup, but the same thing happens and it doesn't return all the elements.
With Selenium:
for i in range(10):
    time.sleep(1)
    scr1 = driver.find_element_by_xpath('//*[@id="search-page-list-container"]')
    driver.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", scr1)
link_tags = driver.find_elements_by_css_selector(".list-card-info a")
links = [link.get_attribute("href") for link in link_tags]
pprint(links)
With bs4:
headers = {
    'accept-language': 'en-GB,en;q=0.8,en-US;q=0.6,ml;q=0.4',
    # adjacent string literals are concatenated, so the first part needs a trailing space
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/74.0.3729.131 Safari/537.36'
}
response = requests.get(ZILLOW_URL, headers=headers)
website_content = response.text
soup = BeautifulSoup(website_content, "html.parser")
link_tags = soup.select(".list-card-info a")
link_list = [link.get("href") for link in link_tags]
pprint(link_list)
Output:
'https://www.zillow.com/b/407-fairmount-ave-oakland-ca-9NTzMK/',
'https://www.zillow.com/homedetails/1940-Buchanan-St-A-San-Francisco-CA-94115/15075413_zpid/',
'https://www.zillow.com/homedetails/2380-California-St-QZ6SFATJK-San-Francisco-CA-94115/2078197750_zpid/',
'https://www.zillow.com/homedetails/5687-Miles-Ave-Oakland-CA-94618/299065263_zpid/',
'https://www.zillow.com/b/olume-san-francisco-ca-65f3Yr/',
'https://www.zillow.com/homedetails/29-Balboa-St-APT-1-San-Francisco-CA-94118/2092859824_zpid/']
Is there any way to tackle this problem? I would really appreciate the help.
CodePudding user response:
You have to scroll to each element one by one in a loop, and then look for the descendant anchor tag that holds the href.
driver.maximize_window()
#driver.implicitly_wait(30)
wait = WebDriverWait(driver, 50)
# single quotes outside, because the URL itself contains double quotes
driver.get('https://www.zillow.com/homes/for_rent/1-_beds/?searchQueryState={"pagination":{},"mapBounds":{"west":-123.7956336875,"east":-121.6368202109375,"south":37.02044483468766,"north":38.36482775108166},"isMapVisible":true,"filterState":{"price":{"max":872627},"beds":{"min":1},"fore":{"value":false},"mp":{"max":3000},"auc":{"value":false},"nc":{"value":false},"fr":{"value":true},"fsbo":{"value":false},"cmsn":{"value":false},"fsba":{"value":false}},"isListVisible":true,"mapZoom":9}')
j = 1  # XPath positions are 1-based
for i in range(len(driver.find_elements(By.XPATH, "//article"))):
    all_items = driver.find_element_by_xpath(f"(//article)[{j}]")
    # scroll the card into view so its link is loaded
    driver.execute_script("arguments[0].scrollIntoView(true);", all_items)
    print(all_items.find_element_by_xpath('.//descendant::a').get_attribute('href'))
    j = j + 1
    time.sleep(2)
Imports:
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Output :
https://www.zillow.com/b/bay-village-vallejo-ca-5XkKWj/
https://www.zillow.com/b/waterbend-apartments-san-francisco-ca-9NLgqG/
https://www.zillow.com/b/the-verdant-apartments-san-jose-ca-5XsGhW/
https://www.zillow.com/homedetails/1539-Lincoln-Ave-San-Rafael-CA-94901/80743209_zpid/
https://www.zillow.com/b/the-crossing-at-arroyo-trail-livermore-ca-5XjR44/
https://www.zillow.com/homedetails/713-Trancas-St-APT-4-Napa-CA-94558/2081608744_zpid/
https://www.zillow.com/b/americana-apartments-mountain-view-ca-5hGhMy/
https://www.zillow.com/b/jackson-arms-apartments-hayward-ca-5XxZLv/
https://www.zillow.com/b/elan-at-river-oaks-san-jose-ca-5XjLQF/
https://www.zillow.com/homedetails/San-Francisco-CA-94108/2078592726_zpid/
https://www.zillow.com/homedetails/20914-Cato-Ct-Castro-Valley-CA-94546/2068418792_zpid/
https://www.zillow.com/homedetails/1240-21st-Ave-3-San-Francisco-CA-94122/2068418798_zpid/
https://www.zillow.com/homedetails/1246-Walker-Ave-APT-207-Walnut-Creek-CA-94596/18413629_zpid/
https://www.zillow.com/b/the-presidio-fremont-ca-5Xk3QQ/
https://www.zillow.com/homedetails/1358-Noe-St-1-San-Francisco-CA-94131/2068418857_zpid/
https://www.zillow.com/b/the-estates-at-park-place-fremont-ca-5XjVpg/
https://www.zillow.com/homedetails/2060-Camel-Ln-Walnut-Creek-CA-94596/2093645611_zpid/
https://www.zillow.com/b/840-van-ness-san-francisco-ca-5YCwMj/
https://www.zillow.com/homedetails/285-Grand-View-Ave-APT-6-San-Francisco-CA-94114/2095256302_zpid/
https://www.zillow.com/homedetails/929-Oak-St-APT-3-San-Francisco-CA-94117/2104800238_zpid/
https://www.zillow.com/homedetails/420-N-Civic-Dr-APT-303-Walnut-Creek-CA-94596/18410162_zpid/
https://www.zillow.com/homedetails/1571-Begen-Ave-Mountain-View-CA-94040/19533010_zpid/
https://www.zillow.com/homedetails/145-Woodbury-Cir-D-Vacaville-CA-95687/2068419093_zpid/
https://www.zillow.com/b/trinity-towers-apartments-san-francisco-ca-5XjPdR/
https://www.zillow.com/b/hidden-creek-vacaville-ca-5XjV3h/
https://www.zillow.com/homedetails/19-Belle-Ave-APT-7-San-Anselmo-CA-94960/2081212106_zpid/
https://www.zillow.com/homedetails/1560-Jackson-St-APT-11-Oakland-CA-94612/2068419279_zpid/
https://www.zillow.com/homedetails/1465-Marchbanks-Dr-APT-2-Walnut-Creek-CA-94598/18382713_zpid/
https://www.zillow.com/homedetails/205-Morning-Sun-Ave-B-Mill-Valley-CA-94941/2077904048_zpid/
https://www.zillow.com/homedetails/1615-Pacific-Ave-B-Alameda-CA-94501/2073535331_zpid/
https://www.zillow.com/homedetails/409-S-5th-St-1-San-Jose-CA-95112/2078856409_zpid/
https://www.zillow.com/homedetails/5635-Anza-St-P5G3CZYNW-San-Francisco-CA-94121/2068419581_zpid/
https://www.zillow.com/b/407-fairmount-ave-oakland-ca-9NTzMK/
https://www.zillow.com/homedetails/1940-Buchanan-St-A-San-Francisco-CA-94115/15075413_zpid/
https://www.zillow.com/homedetails/2380-California-St-QZ6SFATJK-San-Francisco-CA-94115/2078197750_zpid/
https://www.zillow.com/homedetails/1883-Agnew-Rd-UNIT-241-Santa-Clara-CA-95054/79841436_zpid/
https://www.zillow.com/b/marina-playa-santa-clara-ca-5XjKBc/
https://www.zillow.com/b/birch-creek-mountain-view-ca-5XjKKB/
https://www.zillow.com/homedetails/969-Clark-Ave-D-Mountain-View-CA-94040/2068419946_zpid/
https://www.zillow.com/homedetails/74-Williams-St-San-Leandro-CA-94577/24879175_zpid/
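If you need the links in a list rather than printed, the same per-card scrolling can be wrapped in a small helper. This is only a sketch: the name collect_card_links and its pause parameter are made up for illustration, and it assumes every listing is still an article element containing an anchor, as in the code above. It passes the locator strategy as the plain string "xpath", which is the value of By.XPATH, and uses find_element(by, value), the form that current Selenium releases support.

```python
import time


def collect_card_links(driver, pause=2.0):
    """Scroll each <article> card into view and return the hrefs found.

    Hypothetical helper: `driver` is expected to behave like a Selenium
    WebDriver. "xpath" is the string value behind By.XPATH.
    """
    links = []
    count = len(driver.find_elements("xpath", "//article"))
    # XPath positions are 1-based, so iterate from 1 to count inclusive.
    for index in range(1, count + 1):
        card = driver.find_element("xpath", f"(//article)[{index}]")
        driver.execute_script("arguments[0].scrollIntoView(true);", card)
        time.sleep(pause)  # give the lazily loaded card time to render
        # find_elements returns an empty list instead of raising when a
        # card has no anchor yet, so such cards are skipped.
        anchors = card.find_elements("xpath", ".//a")
        if not anchors:
            continue
        href = anchors[0].get_attribute("href")
        if href and href not in links:
            links.append(href)
    return links
```

Calling links = collect_card_links(driver) after driver.get(...) should give roughly the same URLs as the printed output above, without duplicates.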
CodePudding user response:
The problem is the website. It adds the links dynamically, so you can try scrolling to the bottom of the page and then searching for the links.
bottomFooter = driver.find_element_by_id("region-info-footer")
driver.execute_script("arguments[0].scrollIntoView();", bottomFooter)
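A single scroll may not be enough if the page keeps adding cards as you go, so one option is to repeat the scroll until the number of links stops growing. The sketch below is an assumption on my part, not something confirmed against the site: scroll_until_stable, pause and max_rounds are hypothetical names, while the footer id and the CSS selector are the ones already used in this thread. The strings "id" and "css selector" are the values behind By.ID and By.CSS_SELECTOR.

```python
import time


def scroll_until_stable(driver, selector=".list-card-info a",
                        footer_id="region-info-footer",
                        pause=1.0, max_rounds=20):
    """Scroll to the footer repeatedly until no new links appear.

    Hypothetical helper: `driver` is expected to behave like a Selenium
    WebDriver. Stops after max_rounds scrolls even if links keep growing.
    """
    tags = []
    previous = -1
    for _ in range(max_rounds):
        footer = driver.find_element("id", footer_id)
        driver.execute_script("arguments[0].scrollIntoView();", footer)
        time.sleep(pause)  # let the page load the next batch of cards
        tags = driver.find_elements("css selector", selector)
        if len(tags) == previous:
            break  # nothing new appeared since the last scroll
        previous = len(tags)
    # Keep only anchors that actually carry an href.
    hrefs = (tag.get_attribute("href") for tag in tags)
    return [href for href in hrefs if href]
```

If jumping straight to the footer skips cards in the middle of the list, scrolling card by card as in the other answer is probably the safer route.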