Scraping multiple URLs with Selenium

Time:09-26

I'm new to coding, but I wrote this code, which scrapes one page fine. I want to scrape many of these URLs, around 200. How do I do that?

from selenium import webdriver

chrome_path = r"C:\Users\lenovo\Downloads\chromedriver_win32 (5)\chromedriver.exe"

driver = webdriver.Chrome(chrome_path)

driver.get("https://www.kijijiautos.ca/vip/22442312")

driver.find_element_by_xpath('//div[@class="b1yLWE b3zFtQ"]').text

btn = driver.find_element_by_xpath('//button[@class="g1zAe-"]')

btn.click()

[el.text for el in driver.find_elements_by_xpath('//span[@class="A2jAym q2jAym"]')]

driver.find_element_by_xpath('//div[@class="b1yLWE b1zAe-"]').text

print(driver.current_url)

CodePudding user response:

Something like the following:

from selenium import webdriver

chrome_path = r"C:\Users\lenovo\Downloads\chromedriver_win32 (5)\chromedriver.exe"

driver = webdriver.Chrome(chrome_path)


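def get_scraping(link):
    """Load one page and return the text scraped from it."""
    driver.get(link)
    text_before = driver.find_element_by_xpath('//div[@class="b1yLWE b3zFtQ"]').text
    # Click the button, then read the elements that follow.
    btn = driver.find_element_by_xpath('//button[@class="g1zAe-"]')
    btn.click()
    # find_elements returns a list, so read .text from each element.
    span_texts = [el.text for el in driver.find_elements_by_xpath('//span[@class="A2jAym q2jAym"]')]
    text_after = driver.find_element_by_xpath('//div[@class="b1yLWE b1zAe-"]').text
    print(driver.current_url)
    return {"url": driver.current_url, "text_before": text_before,
            "span_texts": span_texts, "text_after": text_after}


links = ["https://www.kijijiautos.ca/vip/22442312", "other_urls"]
scrapings = []
for link in links:
    scrapings.append(get_scraping(link))

driver.quit()  # close the browser once every link has been visited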
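
With around 200 links, hard-coding the list gets unwieldy. Here is a minimal sketch, assuming you keep the URLs one per line in a text file. The helper name `load_links` and the file name `urls.txt` are made up for illustration and are not part of the original code.

```python
from pathlib import Path


def load_links(path):
    """Return the URLs listed one per line in `path`, without blanks or repeats."""
    links = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        # Skip blank lines, lines starting with '#', and URLs already seen.
        if not url or url.startswith("#") or url in seen:
            continue
        seen.add(url)
        links.append(url)
    return links
```

Something like `links = load_links("urls.txt")` could then stand in for the hard-coded list.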

CodePudding user response:

Just add a for loop:

from selenium import webdriver
chrome_path = r"C:\Users\lenovo\Downloads\chromedriver_win32 (5)\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
# List every URL you want to scrape here.
links = ["https://www.kijijiautos.ca/vip/22442312"]
for link in links:
    driver.get(link)
    print(driver.find_element_by_xpath('//div[@class="b1yLWE b3zFtQ"]').text)
    btn = driver.find_element_by_xpath('//button[@class="g1zAe-"]')
    btn.click()
    # find_elements returns a list, so print the text of each element.
    for span in driver.find_elements_by_xpath('//span[@class="A2jAym q2jAym"]'):
        print(span.text)
    print(driver.find_element_by_xpath('//div[@class="b1yLWE b1zAe-"]').text)
    print(driver.current_url)
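
Over 200 pages, a single missing element or failed page load would stop the whole loop. Here is a sketch of one way to keep going and note which links failed. It assumes the per-page steps are wrapped in a function that takes a URL and returns the scraped data. `scrape_all` and `scrape_one` are illustrative names, not Selenium APIs.

```python
import time


def scrape_all(links, scrape_one, delay=0.0):
    """Call scrape_one(link) for each link; return (results, failures)."""
    results = []
    failures = []
    for link in links:
        try:
            results.append(scrape_one(link))
        except Exception as exc:  # e.g. Selenium's NoSuchElementException
            # Record the failing link and carry on with the next one.
            failures.append((link, repr(exc)))
        if delay:
            time.sleep(delay)  # optional pause between page loads
    return results, failures
```

Here `scrape_one` would hold the `driver.get(...)` and `find_element...` calls from the loop body, and `failures` can be retried or inspected afterwards.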