Selenium Get all child element text but skip some child element text under certain condition

Time:03-20

I want to get the text of all child elements but skip the text of certain children, here the <rp> and <rt> parts of each <ruby> element, using Selenium with Python.

Source Code

<div >
    <div >
        ゆえありげな 
        <ruby>
            <rb>品</rb>
            <rp>(</rp>
            <rt roma="sina" hiragana="しな">しな</rt>
            <rp>)</rp>
        </ruby>
        "。"
    </div>
    <div >似有来历的物品。</div>
</div>

Expected Result

ゆえありげな品。
似有来历的物品。

Here is the website example: https://www.mojidict.com/details/198951091?notationMode=1

CodePudding user response:

Try this:

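from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time


# Run Chrome without opening a window
chrome_options = Options()
chrome_options.add_argument("--headless")


# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service("./chromedriver"), options=chrome_options)
driver.get("https://www.mojidict.com/details/198951091?notationMode=1")

# Crude fixed wait so the page's JavaScript can render the examples
time.sleep(5)

# Full text of the eighth example block, reading included
total_text_element = driver.find_element(By.XPATH, "(//div[@class='example-info'])[8]")
total_text = total_text_element.text

# Text of the <rt> (reading) element inside that block
undesired_text_element = driver.find_element(By.XPATH, "(//div[@class='example-info'])[8]/div/ruby/rt")
undesired_text = undesired_text_element.text

# Drop the reading, then the doubled line breaks
desired_text = total_text.replace(undesired_text, "")
desired_text = desired_text.replace("\n\n", "")
print(desired_text)

driver.quit()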

Please note that this handles one specific element, not every matching element. To apply it to all matches of the given XPath, some additional changes are needed. The first is to use find_elements_by_xpath instead of find_element_by_xpath (in Selenium 4, find_elements(By.XPATH, ...) instead of find_element(By.XPATH, ...)). That call returns a list, so the replace step then has to run in a loop over the results.
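One possible way to cover every match, and every <rt> and <rp> inside it, is to read each element's innerHTML and drop the unwanted tags before collecting the text. The sketch below is an alternative to the replace approach above, not a change to it. It uses only the standard library html.parser module; the names SkipTagsTextExtractor and text_without_tags are hypothetical helpers made up for this illustration. It also assumes that whitespace between text fragments can be discarded, which suits Japanese text but would merge words in a space-separated language.

```python
from html.parser import HTMLParser


class SkipTagsTextExtractor(HTMLParser):
    """Collect text content, ignoring anything nested inside the skipped tags."""

    def __init__(self, skip_tags=("rt", "rp")):
        super().__init__(convert_charrefs=True)
        self.skip_tags = set(skip_tags)
        self.skip_depth = 0  # greater than 0 while inside a skipped tag
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Assumption: surrounding whitespace is layout only, so strip it
        if self.skip_depth == 0:
            self.parts.append(data.strip())


def text_without_tags(inner_html, skip_tags=("rt", "rp")):
    """Return the text of an HTML fragment without the text of skip_tags."""
    parser = SkipTagsTextExtractor(skip_tags)
    parser.feed(inner_html)
    parser.close()
    return "".join(parser.parts)


# Possible use with Selenium 4 (the XPath is an assumption based on the
# answer above and may need adjusting for the real page):
#
#   for el in driver.find_elements(By.XPATH, "//div[@class='example-info']/div"):
#       print(text_without_tags(el.get_attribute("innerHTML")))
```

Because the filtering works on the tag names rather than on the reading's text, it does not accidentally delete the same characters when they also appear in the sentence itself, which the string replace could do.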
