Unable to click Next button using selenium as number of pages are unknown

Time:10-07

I am new to Selenium and am trying to scrape:

https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/

I need all the details mentioned on this page and on the other pages as well.

There are more pages containing the same kind of information, and I need to scrape them too. I tried to reach them by changing the target URL:

https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/40

but the trailing number changes and does not match the page number. Page 3 has 40 at the end, and page 5 has:

https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/80

so I am not able to get the data that way.

Here is my code:

def extract_url():
    url = driver.find_elements(By.XPATH,"//h2[@class='resultTitle']//a")
    for i in url:
        dist.append(i.get_attribute("href"))
        
    driver.execute_script("window.scrollTo(0,document.body.scrollHeight)")
    
    driver.find_element(By.XPATH,"//li[@class='btnNextPre']//a").click()

for _ in range(10):
    extract_url()

It works fine till page 5 but not after that. Could you please suggest how I can iterate over pages when the number of pages is unknown, so that I can extract data till the last page?

CodePudding user response:

You need to check whether the pagination link is disabled. Use an infinite loop and break out once the Next button becomes disabled.

Use WebDriverWait() and wait for the visibility of the result elements before collecting them.

Code:

driver.get("https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/")
counter = 1
while True:
    # wait until the result links on the current page are visible
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h2.resultTitle > a")))
    urllist = [item.get_attribute('href') for item in driver.find_elements(By.CSS_SELECTOR, "h2.resultTitle > a")]
    print(urllist)
    print("Page number: " + str(counter))
    # check whether the Next button is disabled; if so, this was the last page
    if len(driver.find_elements(By.XPATH, "//li[@class='disabled']//a[text()='>']")) > 0:
        print("pagination not found!!!")
        break
    driver.execute_script("arguments[0].click();", driver.find_element(By.CSS_SELECTOR, "ul.pagination > li.btnNextPre > a"))
    time.sleep(2)  # to slow down the loop
    counter = counter + 1

Import the libraries below:

from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
import time
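As an alternative to clicking Next, the trailing number in the search URL looks like a result offset rather than a page number: page 3 ends in 40 and page 5 ends in 80, which suggests the offset grows by 20 per page. A minimal sketch of building each page's URL directly, assuming 20 results per page (`page_url` is a hypothetical helper, not part of the site's documented API):

```python
BASE = "https://www.asklaila.com/search/Delhi-NCR/-/book-distributor/"
RESULTS_PER_PAGE = 20  # assumption: offset grows by 20 per page (page 3 -> 40, page 5 -> 80)

def page_url(page):
    """Return the URL for a 1-indexed page, using a result offset in the path."""
    offset = (page - 1) * RESULTS_PER_PAGE
    # page 1 keeps the bare URL; later pages append the offset
    return BASE if page == 1 else BASE + str(offset)

print(page_url(1))  # bare search URL
print(page_url(3))  # ends in /40
print(page_url(5))  # ends in /80
```

You would still need a stopping condition (for example, breaking when a page returns no result links), since the total number of pages is unknown.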