XPath Prints Same Value Incorrectly

Time:03-08

My code opens a webpage and scrapes data from each result block.

However, several elements inside each block share the same class name, so my XPath returns the same value for every field.

For example, Author and Session Name use the same class name.

How do I write an XPath that tells them apart when the class names are identical?

from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://index.mirasmart.com/aan2022/SearchResults.php?pg=1')
page_source = driver.page_source

element = driver.find_elements_by_xpath('.//div[@]')
for el in element:
    author=el.find_element_by_xpath('.//span[@]').text
    sessionName=el.find_element_by_xpath('.//span[@]').text
    print(author,sessionName)

CodePudding user response:

Try the approach below and confirm whether it works. All the details are in <p> tags, so you can get the one you need by indexing.

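from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://index.mirasmart.com/aan2022/SearchResults.php?pg=1")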

# Collecting all the options.
elements = driver.find_elements(By.XPATH,"//div[contains(@class,'search-results-list')]/div")

for element in elements:
    author = element.find_element(By.XPATH,"./div//p[1]") # The first <p> contains Author details.
    print(author.get_attribute("innerText")) 

    session = element.find_element(By.XPATH,"./div//p[2]")# The second <p> contains Session details.
    print(session.get_attribute("innerText"))

Output

Author: Rachel Pauley  Levi Dygert  Aaron Nelson  Heather Lau  
Session Name:   P8: Infectious Disease: Bacteria, Fungi, and Parasites on the Mind and Body 1  
Author: Aaron S Zelikovich  Eric C Lawson  Giana Dawod  Dylan Del Papa  Mikel Shea Ehntholt  Evan Kolesnick  Jaclyn Martindale  Oluwasinmisola Opeyemi  Alexandria Pecoraro  Stephanie Reyes  Andrew Yoo  Aaron L Berkowitz  Matthew S Robbins  
Session Name:   
Author: Jonathan Morena  
Session Name:   P16: MS Clinical Assessments &  Outcome Measures  
Author: Gabriela Figueiredo Pucci  Tara Samiee  Natalie Sholl  Shreya Louis  Adnan Husein  Theandra Madu  Carolina Rodriguez Rivera  Jenny Rotblat  
Session Name:   
Author: Claudia Janoschka  Marisol Herrera-Rivero  Lisa Gerdes  Kathrin Koch  Heinz Wiendl  Reinhard Hohlfeld  Monika Stoll  Luisa Klotz  
Session Name:   
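
The positional indexing used in this answer can be tried without a browser. The following is a minimal sketch using the standard library's xml.etree.ElementTree, which supports a small XPath subset. The SAMPLE markup is hypothetical (the real page may be structured differently), and the path is simplified to ./div/p[1] rather than ./div//p[1].

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE = """
<div class="search-results-list">
  <div>
    <div>
      <p><strong>Author:</strong> Jonathan Morena</p>
      <p><strong>Session Name:</strong> P16: MS Clinical Assessments</p>
    </div>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)
pairs = []
for block in root.findall("./div"):
    first = block.find("./div/p[1]")   # first <p> among its siblings
    second = block.find("./div/p[2]")  # second <p> among its siblings
    # itertext() joins the label and the value, roughly like innerText.
    pairs.append((
        "".join(first.itertext()).strip(),
        "".join(second.itertext()).strip(),
    ))

for author, session in pairs:
    print(author)
    print(session)
```

Positional selection depends on every block listing its paragraphs in the same order. If a block were to omit one paragraph, p[1] would silently point at a different field.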

CodePudding user response:

Surrounding elements can be used to make the XPath expression more precise.

Author:
It reads: "get the span child of a p element that has a strong child whose text is Author:"

//p[strong[.="Author:"]]/span

Session:

//p[strong[.="Session Name:"]]/span

Or

//div[@]/div[@]/p[strong[.="Session Name:"]]/span
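
As a rough illustration of label-based selection, here is a sketch that runs without a browser, using the standard library's xml.etree.ElementTree. The SAMPLE markup, the class names in it, and the field helper are hypothetical. ElementTree does not handle the nested predicate p[strong[.="Author:"]], so the sketch uses p[strong='Author:'], which selects the same elements for this markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every value <span> shares one class name,
# so only the <strong> label tells the fields apart.
SAMPLE = """
<div class="results">
  <div class="result">
    <p><strong>Author:</strong><span class="value">Jonathan Morena</span></p>
    <p><strong>Session Name:</strong><span class="value">P16: MS Clinical Assessments</span></p>
  </div>
  <div class="result">
    <p><strong>Author:</strong><span class="value">Rachel Pauley</span></p>
    <p><strong>Session Name:</strong><span class="value"></span></p>
  </div>
</div>
"""

def field(block, label):
    """Return the text of the <span> whose sibling <strong> equals label (hypothetical helper)."""
    span = block.find(f"./p[strong='{label}']/span")
    if span is None:
        return ""
    return (span.text or "").strip()

root = ET.fromstring(SAMPLE)
rows = [
    (field(block, "Author:"), field(block, "Session Name:"))
    for block in root.findall("./div")
]

for author, session in rows:
    print(author, "|", session)
```

With Selenium, the same idea would be applied per block by starting the expression with a dot, for example element.find_element(By.XPATH, './/p[strong[.="Author:"]]/span'), so that the search stays inside the current block instead of matching the first occurrence on the whole page.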