How to scrape weight of player which is a hidden content using Selenium

Time:04-30

I have been trying to get the weight of this player, but it appears only in the browser's element inspector and not on the rendered page. When I print the result of the code below, I get a blank line. Could anyone please help me with this?

Code trials:

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service

s = Service('/Users/karim/Desktop/chromedriver-2')
driver = webdriver.Chrome(service=s)
url = 'https://www.premierleague.com/players/71432/Rayan-Aït-Nouri/overview'
driver.get(url)

g = driver.find_element(By.XPATH, "//li[@class='u-hide']")
print(g.text)

CodePudding user response:

You need to do the following:

  1. Click the Accept All Cookies button.
  2. Click the advert's close button at the top right-hand side.
  3. Read the element's innerText attribute instead of .text, because the element is hidden.

Code:

driver.maximize_window()
wait = WebDriverWait(driver, 30)

driver.get("https://www.premierleague.com/players/71432/Rayan-Aït-Nouri/overview")

try:
    wait.until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Accept All Cookies']"))).click()
    print("clicked on accept cookies button")
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a#advertClose"))).click()
except Exception:
    # The cookie banner or advert may not appear on every visit; carry on without them.
    pass

#print(wait.until(EC.presence_of_element_located((By.XPATH, "//li[@class='u-hide']"))).get_attribute('innerText'))
print(wait.until(EC.presence_of_element_located((By.XPATH, "//li[@class='u-hide']//div[@class='info']"))).get_attribute('innerText'))

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Output:

clicked on accept cookies button
70kg

Process finished with exit code 0
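
If the scraped value is going into a table (the question imports pandas), it may be easier to work with a number than with the raw string "70kg". Here is a minimal sketch, assuming the site always reports weight as a number followed by "kg". The helper name parse_weight_kg is hypothetical and not part of Selenium.

```python
import re


def parse_weight_kg(raw):
    """Convert a string such as '70kg' to a float, or return None if it does not match."""
    # Assumption: the value is digits (optionally with a decimal part) followed by "kg".
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*kg\s*", raw, flags=re.IGNORECASE)
    if match is None:
        return None
    return float(match.group(1))


print(parse_weight_kg("70kg"))  # 70.0
```

Returning None for unexpected input keeps one malformed page from stopping a longer scraping run. The caller can decide whether to skip or log that player.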

CodePudding user response:

The weight of the player is within:

<div class="info">70kg</div>

which is within its ancestor:

<li class="u-hide">
    <div >Weight</div>
    <div class="info">70kg</div>
</li>
    

where the <li> has the class u-hide.

Because the element is hidden, Selenium's text attribute returns an empty string here, so you have to extract either the innerHTML or the innerText instead.
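
The fallback described above can be wrapped in a small helper. This is a sketch rather than an official Selenium utility: read_hidden_text is a hypothetical name, and it relies only on the element exposing .text and get_attribute(), as a Selenium WebElement does.

```python
def read_hidden_text(element, attribute="innerText"):
    """Return the element's text, falling back to a DOM attribute when it is hidden."""
    # .text only contains text that is rendered, so it is empty for hidden elements.
    visible = element.text
    if visible:
        return visible.strip()
    # get_attribute() can return None if the attribute/property does not exist.
    raw = element.get_attribute(attribute)
    return (raw or "").strip()


# Usage with a live driver (not executed here):
# element = driver.find_element(By.XPATH, "//li[@class='u-hide']//div[@class='info']")
# print(read_hidden_text(element))
```

Trying .text first means the same helper works unchanged for elements that are visible on the page.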


Solution

To print the value of the Weight, you can read either of the following attributes from the element:

  • Using innerHTML:

    print(driver.find_element(By.XPATH, "//li[@class='u-hide']//div[@class='info']").get_attribute('innerHTML'))
    
  • Using innerText:

    print(driver.find_element(By.XPATH, "//li[@class='u-hide']//div[@class='info']").get_attribute('innerText'))
    

Ideally, induce WebDriverWait for presence_of_element_located() so the element is in the DOM before you read it:

  • Using innerHTML:

    print(WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//li[@class='u-hide']//div[@class='info']"))).get_attribute('innerHTML'))
    
  • Using innerText:

    print(WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//li[@class='u-hide']//div[@class='info']"))).get_attribute('innerText'))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    