How to scrape Next button on Linkedin with Selenium using Python?

Time:12-01

I am trying to scrape the LinkedIn website using Selenium, but I can't locate the Next button. I've spent half a day trying to address this, all in vain.

I have tried many different locators, including matching on the button text. The only one that finds anything matches on the start of the ID, but it selects a different button.

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//button[@aria-label='Далее']"} 

IDs of this form are quite common on this site:

//*[starts-with(@id,'e')]

My code:

from selenium import webdriver
from selenium.webdriver import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep



chrome_driver_path = Service(r"E:\programming\chromedriver_win32\chromedriver.exe")  # raw string so backslashes are not treated as escapes
driver = webdriver.Chrome(service=chrome_driver_path)
url = "https://www.linkedin.com/feed/"
driver.get(url)
SEARCH_QUERY = "python developer"
LOGIN = "EMAIL"
PASSWORD = "PASSWORD"
sleep(10)

sign_in_link = driver.find_element(By.XPATH, '/html/body/div[1]/main/p[1]/a')
sign_in_link.click()

login_input = driver.find_element(By.XPATH, '//*[@id="username"]')
login_input.send_keys(LOGIN)
sleep(1)
password_input = driver.find_element(By.XPATH, '//*[@id="password"]')
password_input.send_keys(PASSWORD)
sleep(1)
enter_button = driver.find_element(By.XPATH, '//*[@id="organic-div"]/form/div[3]/button')
enter_button.click()
sleep(25)

lens_button = driver.find_element(By.XPATH, '//*[@id="global-nav-search"]/div/button')
lens_button.click()
sleep(5)

search_input = driver.find_element(By.XPATH, '//*[@id="global-nav-typeahead"]/input')
search_input.send_keys(SEARCH_QUERY)
search_input.send_keys(Keys.ENTER)
sleep(5)

people_button = driver.find_element(By.XPATH, '//*[@id="search-reusables__filters-bar"]/ul/li[1]/button')
people_button.click()
sleep(5)

page_button = driver.find_element(By.XPATH, "//button[@aria-label='Далее']")
page_button.click()

sleep(60)

[Image: Chrome inspection of the Next button]

CodePudding user response:

OK, there are several issues here:

  1. The main reason your code does not work is that the "Next" pagination button is not created on the page until you scroll down, so I added a loop that scrolls the page until the button can be clicked.
  2. It's not a good idea to build locators on localized text such as aria-label='Далее' (Russian for "Next"); they break when the UI language changes.
  3. You should use explicit waits (WebDriverWait with expected_conditions), not hardcoded pauses.
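
To illustrate point 3, here is a rough sketch of what an explicit wait does compared with a fixed sleep: it polls a condition and returns as soon as the condition holds, failing only when the timeout runs out. This is a simplified illustration and not Selenium's own code; the names wait_until and condition are made up for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll condition() until it returns a truthy value or the timeout expires.

    Hypothetical helper for illustration only; in real Selenium code use
    WebDriverWait(driver, timeout).until(...) instead.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # stop as soon as the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)  # short pause before checking again
```

With sleep(25) you always pay the full 25 seconds; with a wait like this you pay only as long as the page actually needs.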

I used a mix of locator types to show that sometimes By.ID is the better choice and sometimes By.XPATH, etc.
The following code works:

import time

from selenium import webdriver
from selenium.webdriver import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument("start-maximized")

webdriver_service = Service(r'C:\webdrivers\chromedriver.exe')  # raw string so backslashes are not treated as escapes
driver = webdriver.Chrome(options=options, service=webdriver_service)
wait = WebDriverWait(driver, 10)

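# Placeholder credentials (hypothetical values; replace with your own).
my_email = "EMAIL"
my_password = "PASSWORD"

url = "https://www.linkedin.com/feed/"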
driver.get(url)

wait.until(EC.element_to_be_clickable((By.XPATH, "//a[contains(@href,'login')]"))).click()
wait.until(EC.element_to_be_clickable((By.ID, "username"))).send_keys(my_email)
wait.until(EC.element_to_be_clickable((By.ID, "password"))).send_keys(my_password)
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[type='submit']"))).click()
search_input = wait.until(EC.element_to_be_clickable((By.XPATH, "//input[contains(@class,'search-global')]")))
search_input.click()
search_input.send_keys("python developer" + Keys.ENTER)
wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="search-reusables__filters-bar"]/ul/li[1]/button'))).click()
wait = WebDriverWait(driver, 4)
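# Keep scrolling until the "Next" button exists and can be clicked.
while True:
    try:
        next_btn = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "button.artdeco-pagination__button.artdeco-pagination__button--next")))
        # Accessing this property scrolls the element into view.
        next_btn.location_once_scrolled_into_view
        time.sleep(0.2)
        next_btn.click()
        break
    except Exception:
        # Button not present or not clickable yet: scroll down 600 px and retry.
        driver.execute_script("window.scrollBy(0, arguments[0]);", 600)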
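
One caveat with the loop above: if the button never appears (for example, on the last page of results), it scrolls forever. A minimal sketch of a bounded variant follows. The helper click_after_scrolling and its parameters are hypothetical names for this example; it takes plain callables so the retry logic can be tried without a browser.

```python
import time
from typing import Callable


def click_after_scrolling(
    try_click: Callable[[], None],
    scroll: Callable[[], None],
    max_scrolls: int = 20,
    pause: float = 0.2,
) -> int:
    """Call try_click until it succeeds, scrolling between failed attempts.

    Returns how many scrolls were needed. Raises TimeoutError if the click
    still fails after max_scrolls scrolls.
    """
    for scrolls_done in range(max_scrolls + 1):
        try:
            try_click()
            return scrolls_done
        except Exception:
            if scrolls_done == max_scrolls:
                break  # out of attempts, give up below
            scroll()
            time.sleep(pause)  # give the page a moment to render
    raise TimeoutError(f"click did not succeed after {max_scrolls} scrolls")


# Possible wiring with the driver and wait objects from the answer
# (not executed here, shown only as an assumption about usage):
#
# def try_click():
#     selector = "button.artdeco-pagination__button--next"
#     wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, selector))).click()
#
# def scroll():
#     driver.execute_script("window.scrollBy(0, 600);")
#
# click_after_scrolling(try_click, scroll)
```

The cap of 20 scrolls is an arbitrary choice; pick whatever fits the length of the results page.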