I'm using Selenium in Python to scrape multiple pages. The IDs and XPaths change from page to page, so I figured it would be best to locate elements by their attribute-value combinations (see below).
I'm trying to access the text in the following element: https://i.stack.imgur.com/ly1YU.png which belongs to the following: https://i.stack.imgur.com/strep.png
As I said, the IDs keep changing, so I want to locate the element by data-fragment-name="articleDetail" or data-testid="article-body". Can somebody show me how to do this?
Thanks in advance!
CodePudding user response:
Try using the following CSS selector:
div[data-fragment-name='articleDetail'] div[data-testid='article-body']
Or this XPath:
//div[@data-fragment-name='articleDetail']//div[@data-testid='article-body']
The Selenium command can look like this (with By imported from selenium.webdriver.common.by):
driver.find_element(By.CSS_SELECTOR, "div[data-fragment-name='articleDetail'] div[data-testid='article-body']")
Or
driver.find_element(By.XPATH, "//div[@data-fragment-name='articleDetail']//div[@data-testid='article-body']")
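Since the question asks for the element's text, here is a minimal sketch of how the CSS selector above could be used to get it. The names css_for, ARTICLE_BODY_CSS and article_text are made up for illustration, and the 10-second timeout is an assumption. The explicit wait is there in case the article body is rendered after the initial page load; if the page is static, a plain find_element call followed by .text should do.

```python
def css_for(tag, attrs):
    """Build a CSS selector such as div[data-testid='article-body']."""
    return tag + "".join(f"[{name}='{value}']" for name, value in attrs.items())


# Descendant selector: the article body somewhere inside the articleDetail fragment.
ARTICLE_BODY_CSS = " ".join([
    css_for("div", {"data-fragment-name": "articleDetail"}),
    css_for("div", {"data-testid": "article-body"}),
])


def article_text(driver, timeout=10):
    """Return the visible text of the article body on the current page.

    `driver` is assumed to be a Selenium WebDriver that has already
    loaded the page; `timeout` (seconds) is an arbitrary default.
    """
    # Imported here so the selector helper above works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Wait until the element is present in the DOM, then read its text.
    element = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, ARTICLE_BODY_CSS))
    )
    return element.text
```

For several pages you could call driver.get(url) for each URL and then article_text(driver), since the attribute-based selector does not depend on the changing IDs.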
CodePudding user response:
from selenium.webdriver.common.by import By
obj = driver.find_element(By.XPATH, "//div[@data-fragment-name='articleDetail']")
obj2 = driver.find_element(By.XPATH, "//div[@data-testid='article-body']")
This assumes driver is an existing WebDriver instance, e.g. driver = webdriver.Firefox(), and that you have already navigated to the desired page. The element's text is then available as obj2.text.