I'm trying to get the text of a parent element, excluding the text of its child elements, from a webpage with a structure like this:
<div class="parent">
"Apples"
<span>"Bananas"</span>
</div>
The text I'm interested in is "Apples". The XPath selector //*[@class='parent']/text()[last()] works in the browser, but Selenium raises this error:
Message: invalid selector: The result of the xpath expression "//*[@class='parent']/text()[last()]" is: [object Text]. It should be an element.
This happens when I try to obtain the text with Selenium in Python like this:
driver.find_element(By.XPATH, ("//*[@class='parent']/text()[last()]")).text()
To sum up, my goal is to get the string "Apples" returned to me. So far I have only managed to get a string like "ApplesBananas". The string itself is not predictable, so filtering with contains() is out of the question.
CodePudding user response:
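This cannot be done with an XPath locator alone: Selenium's find_element must return an element, and text() selects a text node, which is what the error message is complaining about.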
What you need to do here is:
Get the parent element's text (it includes the parent's own text and the text of its child elements).
Then remove the child element's text from it, as follows:
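# Text of the parent, including the text of its descendants
all_text = driver.find_element(By.XPATH, "//*[@class='parent']").text
# find_element returns only the first matching descendant
child_text = driver.find_element(By.XPATH, "//*[@class='parent']//*").text
# Remove the child's text and strip the whitespace left behind
parent_text = all_text.replace(child_text, '').strip()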
If there are multiple child nodes, the text of every one of them has to be removed, as follows:
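all_text = driver.find_element(By.XPATH, "//*[@class='parent']").text
child_elements = driver.find_elements(By.XPATH, "//*[@class='parent']//*")
# Start from the full text and remove each child's text in turn,
# so that earlier removals are not lost on the next iteration
parent_text = all_text
for child_element in child_elements:
    parent_text = parent_text.replace(child_element.text, '')
print(parent_text.strip())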
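One caveat with the replace approach: if a child's text also occurs inside the parent's own text, it gets removed there too. A possible alternative, not part of the approach above, is to let the browser collect only the text nodes that are direct children of the element and return them through execute_script. The following is a minimal sketch under that assumption; the names OWN_TEXT_SCRIPT and get_own_text are made up for illustration, and driver is assumed to be an already running WebDriver instance.

```python
# JavaScript run in the browser: keep only the direct child nodes that are
# text nodes, join their contents, and trim surrounding whitespace.
OWN_TEXT_SCRIPT = """
return Array.from(arguments[0].childNodes)
    .filter(node => node.nodeType === Node.TEXT_NODE)
    .map(node => node.textContent)
    .join('')
    .trim();
"""


def get_own_text(driver, element):
    """Return the element's own text, without the text of child elements.

    `driver` is assumed to be a Selenium WebDriver and `element` a
    WebElement found with it (hypothetical helper, not a Selenium API).
    """
    return driver.execute_script(OWN_TEXT_SCRIPT, element)


# Usage (needs a live browser session, so it is left commented out):
# from selenium.webdriver.common.by import By
# parent = driver.find_element(By.XPATH, "//*[@class='parent']")
# print(get_own_text(driver, parent))
```

Because the filtering happens on DOM nodes rather than on the rendered string, this does not depend on what the child elements' text looks like.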