Selenium Window Scroll to Bottom question

Time:11-11

Hello, I am trying to use Selenium to scrape the titles from this page: https://sondors.com/collections/foldable-ebikes

It seems that the elements only show up after I scroll down the page, so I use:

driver.execute_script("window.scrollTo(0,document.body.scrollHeight);")

before I call

driver.find_elements(By....,...)

But nothing gets printed. Could you please help me see why it is not working? I am using Colab, and my code is below:

!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!pip install selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
#set up Chrome driver
options=webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
#Define web driver as a Chrome driver
driver=webdriver.Chrome('chromedriver',options=options)
driver.implicitly_wait(10)

url='https://sondors.com/collections/foldable-ebikes'
driver = webdriver.Chrome('chromedriver', options=options)

driver.get(url)
print(driver.title)
driver.implicitly_wait(10)
#driver.find_element(By.ID, 'CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll').click()
driver.execute_script("window.scrollTo(0,document.body.scrollHeight);")


Titles=driver.find_elements(By.CLASS_NAME,'small-title fx fade-in roll-up animated')
for i in range(len(Titles)):
  print(Titles[i].text)

CodePudding user response:

Try this:

# Needed libs
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# Set up Chrome driver
options=webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

# Define web driver as a Chrome driver (created once; a second call would launch a second browser)
driver = webdriver.Chrome('chromedriver', options=options)

# Maximize window
driver.maximize_window()

# Navigate to url
url = 'https://sondors.com/collections/foldable-ebikes'
driver.get(url)

# We get the titles by XPATH
titles = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='wrapper mar-b-100']/div//h2")))

# For every title we get textContent
for title in titles:
    print(title.get_attribute("textContent"))

Why textContent instead of text?

Because textContent does not require the text to be visible.

Why did your code not work?

This is closely related to the previous point. You scrolled to the end of the page, and once you are at the bottom, the titles are no longer visible on screen, so you were getting empty strings. That is why I used textContent: it does not matter whether the text is VISIBLE on your screen, it only needs to be in the DOM.
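Separately from the visibility point, the locator in the question may itself be a problem. By.CLASS_NAME is meant for a single class name, and the value passed there contains five space-separated names, so find_elements would most likely return an empty list, which would also explain nothing at all being printed. A minimal sketch of turning such a class attribute into a compound CSS selector follows; the helper name classes_to_css is made up for illustration and is not part of Selenium.

```python
def classes_to_css(class_attribute):
    """Turn a space-separated class attribute into a compound CSS selector.

    Hypothetical helper for illustration; not a Selenium API.
    """
    names = class_attribute.split()
    if not names:
        raise ValueError("expected at least one class name")
    # ".a.b.c" matches elements that carry all of the listed classes
    return "".join("." + name for name in names)


selector = classes_to_css("small-title fx fade-in roll-up animated")
print(selector)
```

The resulting string can be passed as driver.find_elements(By.CSS_SELECTOR, selector). Classes such as fade-in or animated look like they could be added by the page's animation code only once an element comes into view (an assumption, not verified against the site), so matching on the most stable class alone, for example ".small-title", may be more robust.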

You can verify this by changing the last line of my code from:

    print(title.get_attribute("textContent"))

to

    print(title.text)

And you will also see empty strings.
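If you want a single call that works in both situations, one option is a small wrapper that prefers the rendered text and falls back to textContent when the rendered text is empty. This is only a sketch: element_text is a hypothetical helper name, and it assumes an object with a .text attribute and a get_attribute method, as Selenium's WebElement provides.

```python
def element_text(element):
    """Return the rendered text, or the DOM textContent if nothing is rendered.

    Hypothetical helper; `element` is assumed to behave like a Selenium WebElement.
    """
    visible = element.text.strip()
    if visible:
        return visible
    # textContent is read from the DOM, so it is available even when the
    # element is hidden or scrolled out of view; it may be None.
    raw = element.get_attribute("textContent") or ""
    # Collapse the indentation and newlines that raw DOM text often carries.
    return " ".join(raw.split())
```

In the loop above you would then write print(element_text(title)) instead of reading either property directly.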

How did I proceed?

What is my goal? Getting the titles.

Are the titles there when I open the page, or do I need to do something else first? They are already there, so I do not need to do anything else.

Are they visible? No. Okay, then I can use textContent.
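On this page the answer to the second question was "nothing else is needed". For a page where content really is loaded only while scrolling, a single scrollTo call may not be enough, and a common pattern is to keep scrolling until the page height stops growing. The sketch below assumes that pattern; scroll_until_stable, pause and max_rounds are illustrative names, and the only driver call used is execute_script.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=10):
    """Scroll to the bottom repeatedly until the page height stops growing.

    Hypothetical helper. Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was appended, so we have reached the real bottom.
            return round_number
        last_height = new_height
    return max_rounds
```

You would call scroll_until_stable(driver) after driver.get(url) and before locating the elements. The fixed pause is a simplification; an explicit WebDriverWait on the elements you expect is usually more reliable.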
