Home > database >  How to get the header info of the first news on the page using Selenium
How to get the header info of the first news on the page using Selenium

Time:12-29

I am trying to navigate to each of the links and get their first news on the page.but can't get a unique xpath for the header info across all pages

Refer the attached pic, it shows 2 elements. how can i make it as unique to fetch only the header / title news. enter image description here

CodePudding user response:

You can select only the first element by using the parent's first child and then point to the child element, for example:

//div/div[1]/div/a/h3[@]

[1] is equal to :first-child

CodePudding user response:

To retrieve the header of the first news you need to induce WebDriverWait for the visibilityOfElementLocated() and you can use either of the following Locator Strategies:

  • Java and xpath and getText():

    System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//div[contains(@class, 'gs-u-display-inline-block@m')]//h3[@class='gs-c-promo-heading__title gel-paragon-bold nw-o-link-split__text']"))).getText());
    
  • Python and css_selector and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div[class*='gs-u-display-inline-block@m'] h3.gs-c-promo-heading__title.gel-paragon-bold.nw-o-link-split__text"))).text)
    
  • Console Output:

    Russia orders oldest rights group Memorial to shut
    
  • Note : For Python clients you have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

CodePudding user response:

There are many headers on that page and they are changing. I mean some news are coming and being removed from this page.
If you want to get all the news title texts you can get all the headers elements and then iterate over them and extract their texts.
Something like this:

List<WebElement> headers = driver.findElements(By.xpath("//h3[@class='gs-c-promo-heading__title gel-pica-bold nw-o-link-split__text']"));
for (WebElement header : headers){    
    ((JavascriptExecutor) driver).executeScript("arguments[0].scrollIntoView(true);", header);
    System.out.println(header.getText());
}
  • Related