Selenium : Get text inside an element but not texts inside the nested tags within it

Time:07-06

Let's say I have an element:

<div >
    ₹199 
    <span >
        ₹690
    </span>
    <div >
        No discount available for this product
    </div>
</div>

When I fetch the element by class name:

div_containing_radio = driver.find_element(by=By.XPATH, value="//div[starts-with(@class, 'ProductVariants__RadioButtonInner')]//ancestor::div[starts-with(@class, 'ProductVariants__VariantCard')]")
div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua").text

This gives me

'₹199 ₹690 No discount available for this product'

What I wanted was just ₹199.

Note that I can't simply split the text on spaces and take the first token, because the structure of the page keeps changing.

CodePudding user response:

Using a little bit of JS:

js_query = """
            var x = document.querySelector('.ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua').childNodes;
            var l = "";
    
            x.forEach(i => {
                if (i.nodeName === '#text') {
                    l += ' ' + i.textContent;
                }
            });
            return l;
"""

price = driver.execute_script(js_query).strip()
print(price)

Output:

₹199

The JS fetches all child nodes of the target div and iterates over them. For each text node (nodeName '#text') it appends the textContent to the string l, skipping element nodes such as the nested span and div. The script returns l, and Python strips the surrounding whitespace. That's it.
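
The same idea can be wrapped in a small reusable helper that receives the element as arguments[0] instead of hardcoding the class name in the script. This is only a sketch: own_text and OWN_TEXT_JS are hypothetical names, not part of Selenium, and it assumes driver is a WebDriver and element is a WebElement you have already located.

```python
# JavaScript run in the page: join the element's direct text nodes,
# ignoring anything inside nested tags.
OWN_TEXT_JS = """
return Array.from(arguments[0].childNodes)
    .filter(node => node.nodeType === Node.TEXT_NODE)
    .map(node => node.textContent.trim())
    .filter(text => text.length > 0)
    .join(' ');
"""


def own_text(driver, element):
    """Return only the text that sits directly inside `element`.

    `own_text` is a hypothetical helper name; `driver` is assumed to be a
    Selenium WebDriver and `element` a WebElement found earlier.
    """
    # execute_script passes extra arguments to the script as arguments[0], ...
    return driver.execute_script(OWN_TEXT_JS, element).strip()
```

With the element from the question, the call would look like price = own_text(driver, price_element), where price_element is whatever find_element returned for the price container.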

CodePudding user response:

@Firelord's answer (+1) can be simplified as:

div_containing_radio = driver.find_element(by=By.XPATH, value="//div[starts-with(@class, 'ProductVariants__RadioButtonInner')]//ancestor::div[starts-with(@class, 'ProductVariants__VariantCard')]")
price = div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua")

print(driver.execute_script("return arguments[0].firstChild.textContent;", price).strip())
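
If running JavaScript is not an option, a rough alternative is to take the parent's text and remove the text of each direct child element from it. The sketch below is not from the answers above: own_text_without_js is a hypothetical helper, and it assumes each child's text appears verbatim inside the parent's text and does not also occur earlier in the parent's own text.

```python
def own_text_without_js(element):
    """Strip each direct child's text from the parent's text.

    Hypothetical helper; `element` is assumed to be a Selenium WebElement.
    """
    remaining = element.text
    # "xpath" is the string value of By.XPATH; "./*" selects direct child elements only.
    for child in element.find_elements("xpath", "./*"):
        child_text = child.text
        if child_text:
            # Remove only the first occurrence so repeated text elsewhere survives.
            remaining = remaining.replace(child_text, "", 1)
    return remaining.strip()
```

This costs one extra round trip per child element, so the execute_script version is likely the better fit when the element has many children.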

CodePudding user response:

To print only 199 from the string ₹199 ₹690 No discount available for this product, you just need to split the entire string on the ₹ character and print the second element (the first element is the empty string before the leading ₹), stripping the trailing space:

print(div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua").text.split("₹")[1].strip())

As an alternative, you can split the string on spaces and print the first element, which keeps the ₹ sign:

print(div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua").text.split(" ")[0])    
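
Splitting by position breaks as soon as the text gains a prefix or a differently formatted amount, so a regular expression may be a slightly more tolerant variant of the same string-based approach. The sketch below assumes the wanted price is always the first ₹ amount in the text, which may not hold if the page structure changes; PRICE_PATTERN and first_price are hypothetical names.

```python
import re

# Assumed price format: the rupee sign, optional spaces, then digits with
# optional thousands separators and an optional decimal part.
PRICE_PATTERN = re.compile(r"₹\s*\d[\d,]*(?:\.\d+)?")


def first_price(text):
    """Return the first rupee amount found in `text`, or None if there is none."""
    match = PRICE_PATTERN.search(text)
    return match.group(0) if match else None


print(first_price("₹199 ₹690 No discount available for this product"))  # ₹199
```

Returning None instead of raising IndexError makes the missing-price case explicit for the caller.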

CodePudding user response:

Try this:

div_containing_radio = driver.find_element(by=By.XPATH, value="//div[starts-with(@class, 'ProductVariants__PriceContainer-sc-1unev4j-9 jjiIua')]/following-sibling::text()")