How to click show more button while scraping from multiple pages?

I have a script that scrapes 10 pages, one after another, in a loop.

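import time
from random import randint

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# hyperlink_list is the list of page URLs
options = webdriver.ChromeOptions()
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)

for url in hyperlink_list[:10]:
    time.sleep(randint(10, 24))  # random pause between page loads
    driver.get(url)
    time.sleep(10)  # give the page time to finish loading
    soup = BeautifulSoup(driver.page_source, 'html.parser')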

Now from the pages, I am extracting this part:

Only some pages, the ones with a longer description, have a show more link. I want to click this link and extract the full description whenever the link is present.

HTML for the show more link:

<a id="rfq-info-header-description-showmorebutton">
              show more
            </a>

I want to click this link only if it is present; on pages without it, the lookup fails with an element not found error.

CodePudding user response：

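Use more = driver.find_element(By.ID, "rfq-info-header-description-showmorebutton") (assuming the show more link can always be found using this id). The older spelling, driver.find_element_by_id(...), was removed in Selenium 4.3. If the link is not found, this raises NoSuchElementException, which you can catch to skip pages that do not have it.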

CodePudding user response：

Wrap the lookup in a try-except block and check whether the show more element exists. Below I use find_elements (plural) and len() to count the matches; if the count is greater than 0, the element is present, and I click it using an explicit wait.

If the count is 0, the show more link is not on the page, and that branch just prints a message.

Code:

try:
    # find_elements returns an empty list (no exception) when nothing matches
    if len(driver.find_elements(By.XPATH, "//a[@id='rfq-info-header-description-showmorebutton']")) > 0:
        print("Show more link is available so Selenium bot will click on it.")
        WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.XPATH, "//a[@id='rfq-info-header-description-showmorebutton']"))).click()
        print('Clicked on show more link')
    else:
        print("Show more link is not available")
except Exception as e:
    # for example, TimeoutException if the link never becomes clickable
    print('Something else went wrong:', e)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC