Scraping the web with Selenium: an error

Time:10-08

I am trying to use Selenium to get data from a TipRanks analyst page such as https://www.tipranks.com/experts/analysts/dan-payne

The code is as follows:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

driver = webdriver.Chrome('chromedriver')
url = 'https://www.tipranks.com/experts/analysts/john-pitzer'

driver.get(url)
time.sleep(30)

elements = driver.find_elements(By.CLASS_NAME, "override")

# This is text from the inspector: <text  x="50%" y="50%" fill="#1ead00" text-anchor="middle" dy="0.28em">69%</text>

for element in elements:
    print(element.text)

But it returns empty results. I checked the div of the element (e.g. Success Rate) in the inspector, but the result is still the same.

Similarly, it does not work with XPath when I use the inspector's Copy XPath option, which gives:

#elements = driver.find_elements(By.XPATH,"/html/body/div[1]/div[2]/div[4]/div/div[3]/div[1]/div[1]/div[1]/div[2]/div[3]/div[2]/div[2]/div[2]/div[1]/div[2]/svg/g/text")

Please help me work out what I am doing wrong. Thank you.

CodePudding user response:

If you want all the text, you can use this XPath:

//*[contains(@class,'override')]

And if you want only the 69% text, you can use this:

//*[contains(@class,'override') and contains(text(),'69')]
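A minimal sketch of how these two expressions could be built in Python before being handed to Selenium. The helper name class_xpath is hypothetical and is not part of Selenium; it also assumes the class name and text fragment contain no single quotes.

```python
def class_xpath(class_name, text_fragment=None):
    """Build an XPath matching any element whose class attribute contains class_name.

    If text_fragment is given, the element's text must contain it as well.
    class_xpath is a hypothetical helper for illustration, not a Selenium API.
    """
    conditions = [f"contains(@class,'{class_name}')"]
    if text_fragment is not None:
        conditions.append(f"contains(text(),'{text_fragment}')")
    return "//*[" + " and ".join(conditions) + "]"


# The two expressions from this answer.
all_text_xpath = class_xpath("override")
percent_xpath = class_xpath("override", "69")
print(all_text_xpath)
print(percent_xpath)
```

With the driver from the question, either string would then be passed as driver.find_elements(By.XPATH, percent_xpath).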

CodePudding user response:

I am assuming you need the Success Rate percentage from that page. For that, try the code below.

If the 'Cookie Privacy Statement' dialog appears, use these 2 lines of code to handle it:

accept_btn = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"button#cookies_accept_btn")))
driver.execute_script("arguments[0].click();", accept_btn)

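The Success Rate percentage value is inside an 'svg' tag. SVG elements belong to the SVG namespace, so a plain element name such as 'svg' in an XPath does not match them. To handle 'svg' tags, use 'local-name()' in the XPath, as below: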

data = driver.find_elements(By.XPATH,".//*[local-name()='svg']/*[local-name()='g']/*[local-name()='text']")

for percentage in data:
    print(percentage.text)
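The namespace effect behind this can be reproduced without a browser. The sketch below uses Python's standard xml.etree.ElementTree on a hypothetical, simplified fragment modelled on the text element quoted in the question; it is not the real TipRanks markup. ElementTree does not support local-name(), so the sketch qualifies each step with the SVG namespace instead, which is an analogy for the browser behaviour rather than the Selenium call itself.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical, simplified markup; only the <text> element mirrors the question.
DOC = f"""
<div>
  <svg xmlns="{SVG_NS}">
    <g>
      <text x="50%" y="50%" fill="#1ead00" text-anchor="middle" dy="0.28em">69%</text>
    </g>
  </svg>
</div>
"""

root = ET.fromstring(DOC)

# Plain tag names only match elements that are in no namespace,
# so a path like the copied ".../svg/g/text" finds nothing.
plain = root.findall(".//svg/g/text")

# Qualifying every step with the SVG namespace matches the element.
qualified = root.findall(f".//{{{SVG_NS}}}svg/{{{SVG_NS}}}g/{{{SVG_NS}}}text")

texts = [el.text for el in qualified]
print(len(plain), texts)
```

The plain path returns no elements while the namespace-aware path returns the percentage text, which is consistent with the copied absolute XPath in the question coming back empty.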