How to extract multiple texts from span elements using Python Selenium?

I am trying to use Selenium WebDriver to extract the text of every inner span in the HTML below into a list, like this:

['1a', '1b', '1c', '2a', ' ', ' ', '3a', '3b', '3c', '4a', ' ', ' ']

Does anyone know how to do this?

HTML:

<tr style="background-color:#999">
    <td><b style="white-space: nowrap;">table_num</b></td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>1a</span>
                <span>1b</span>
                <span>1c</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>2a</span>
                <span>　　　　　</span>
                <span>　　　　　</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>3a</span>
                <span>3b</span>
                <span>3c</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>4a</span>
                <span>　　　　　</span>
                <span>　　　　　</span>
            </span>
        </td>
</tr>

CodePudding user response：

Use the XPath below, which matches all of the required inner spans:

//span[contains(@style,"column")]/span

Once you have the span elements, read the text of each one.

If a span's text is blank, either skip it or add a placeholder such as ' ' to the list; otherwise add the text itself.
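
The steps above can be sketched roughly as follows. The helper name collect_span_texts and its blank and keep_blank parameters are hypothetical, and simple stand-in objects replace real WebElements so that the sketch runs without a browser; with Selenium you would pass the result of driver.find_elements(...) instead.

```python
from types import SimpleNamespace


def collect_span_texts(elements, blank=" ", keep_blank=True):
    """Return the text of each element, mapping whitespace-only text to `blank`.

    `elements` is any iterable of objects with a `.text` attribute, such as
    the list returned by Selenium's driver.find_elements(...).
    """
    texts = []
    for element in elements:
        # str.strip() also removes U+3000 ideographic spaces.
        text = element.text.strip()
        if text:
            texts.append(text)
        elif keep_blank:
            texts.append(blank)
    return texts


# Stand-ins for WebElements (an assumption made so this runs offline).
# With Selenium, pass instead:
#   driver.find_elements(By.XPATH, '//span[contains(@style,"column")]/span')
fake_spans = [
    SimpleNamespace(text=t)
    for t in ["1a", "1b", "1c", "2a", "\u3000" * 5, "\u3000" * 5]
]

print(collect_span_texts(fake_spans))                    # ['1a', '1b', '1c', '2a', ' ', ' ']
print(collect_span_texts(fake_spans, keep_blank=False))  # ['1a', '1b', '1c', '2a']
```

Keeping the placeholder preserves three entries per cell, which matches the list in the question; dropping blanks gives a shorter list with only the visible labels.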

CodePudding user response：

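Given this HTML, to collect the text of all the inner <span> elements into a list, wait for them with WebDriverWait and visibility_of_all_elements_located(), then build the list with a list comprehension. You can use either of the following locator strategies. Both snippets assume the usual imports: WebDriverWait from selenium.webdriver.support.ui, expected_conditions as EC from selenium.webdriver.support, and By from selenium.webdriver.common.by.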

Using CSS_SELECTOR and text attribute:

driver.get("application url")
# The cells are siblings under the row, so match td directly instead of a td nested in another td.
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td > span > span")))])

Using XPATH and get_attribute("innerHTML"):

driver.get("application url")
# The cells are siblings under the row, so match td directly instead of a td nested in another td.
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td/span/span")))])
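
One way to check a locator path against the question's markup without starting a browser is to parse the row with Python's standard library. The sketch below is only a rough check under two assumptions: the first cell is meant to close with a plain </td>, and the row is trimmed to two data cells. ElementTree supports only a subset of XPath, so the starts-with(@style, ...) predicate is left out. It suggests that the inner spans sit at td/span/span and that no td in this row is nested inside another td.

```python
import xml.etree.ElementTree as ET

# The row from the question, trimmed to two data cells. The first cell is
# closed with a plain </td>, which is an assumption about the intended markup.
# \u3000 is the ideographic space used in the blank spans.
ROW = """
<tr style="background-color:#999">
    <td><b style="white-space: nowrap;">table_num</b></td>
    <td style="text-align:center;">
        <span style="flex: 1;display: flex;flex-direction: column;">
            <span>1a</span>
            <span>1b</span>
            <span>1c</span>
        </span>
    </td>
    <td style="text-align:center;">
        <span style="flex: 1;display: flex;flex-direction: column;">
            <span>2a</span>
            <span>\u3000\u3000\u3000\u3000\u3000</span>
            <span>\u3000\u3000\u3000\u3000\u3000</span>
        </span>
    </td>
</tr>
"""

row = ET.fromstring(ROW.strip())

# Inner spans: direct children of the flex span, which is a direct child of a td.
inner_spans = row.findall("./td/span/span")
texts = [span.text for span in inner_spans]

# Any td that is a descendant of another td.
nested_cells = row.findall(".//td//td")

print(len(inner_spans))  # 6
print(nested_cells)      # []
```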