I want to get the text of all child elements but skip the text of certain child elements, namely the <rp> and <rt> parts, using Selenium with Python.
Source Code
<div >
<div >
ゆえありげな
<ruby>
<rb>品</rb>
<rp>(</rp>
<rt roma="sina" hiragana="しな">しな</rt>
<rp>)</rp>
</ruby>
"。"
</div>
<div >似有来历的物品。</div>
</div>
Expected Result
ゆえありげな品。
似有来历的物品。
Here is the website example: https://www.mojidict.com/details/198951091?notationMode=1
CodePudding user response:
Try this:
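from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
chrome_options = Options()
chrome_options.add_argument("--headless")
# Selenium 4 style; older releases used executable_path=... and find_element_by_xpath(...)
driver = webdriver.Chrome(service=Service("./chromedriver"), options=chrome_options)
driver.get("https://www.mojidict.com/details/198951091?notationMode=1")
time.sleep(5)  # crude fixed wait for the page to finish rendering
# Full visible text of the 8th example block, reading included
total_text_element = driver.find_element(By.XPATH, "(//div[@class='example-info'])[8]")
total_text = total_text_element.text
# The <rt> element holds the reading that should be dropped
undesired_text_element = driver.find_element(By.XPATH, "(//div[@class='example-info'])[8]/div/ruby/rt")
undesired_text = undesired_text_element.text
# Remove the reading, then drop the doubled newlines
desired_text = total_text.replace(undesired_text, "")
desired_text = desired_text.replace("\n\n", "")
print(desired_text)
driver.quit()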
Please note that this handles one specific element, not all of them. To apply it to every element matching the given XPath, some additional changes are needed. The first is to use find_elements_by_xpath instead of find_element_by_xpath (in Selenium 4, find_elements(By.XPATH, ...) instead of find_element(By.XPATH, ...)), which returns a list of matches that you then loop over.
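A different approach, not the one above: instead of subtracting the <rt> text with replace (which would also delete the same characters if they appear elsewhere in the sentence), you could parse the element's HTML and skip everything inside <rt> and <rp>. Below is a minimal sketch using only the standard library. The names RubyBaseTextExtractor and strip_ruby_annotations are made up for illustration, the sample string is the markup from the question with the quotes around 。 removed, and dropping all whitespace is an assumption that suits Japanese and Chinese text but not text with spaces between words.

```python
from html.parser import HTMLParser


class RubyBaseTextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <rt> or <rp>."""

    SKIPPED_TAGS = {"rt", "rp"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <rt> or <rp>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Assumption: all whitespace (including newlines between tags) is noise
        return "".join("".join(self._chunks).split())


def strip_ruby_annotations(inner_html):
    """Return the text of an HTML fragment without ruby readings."""
    parser = RubyBaseTextExtractor()
    parser.feed(inner_html)
    parser.close()  # flush any buffered text
    return parser.text()


# Markup taken from the question
sample = """
ゆえありげな
<ruby>
<rb>品</rb>
<rp>(</rp>
<rt roma="sina" hiragana="しな">しな</rt>
<rp>)</rp>
</ruby>
。
"""
print(strip_ruby_annotations(sample))
```

With Selenium you would feed it element.get_attribute("innerHTML"). To cover all examples, loop over the list returned by find_elements and, assuming each example block wraps the sentence and its translation in separate child divs as in the question, call the function once per child div so the two lines stay separate.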