I need to get text from HTML using Python, but the HTML has 2 elements with the same class name

I have HTML like this:

<div class='mesage-in'> cool text here </div>
<div class='mesage-in'> bad text here </div>

and my Python code is:

texto = navegador.find_element_by_class_name('message-in').text
print(texto)

Is it possible to get all elements with the same class name and put them in an array, or assign each one to a different variable? Like this:

Output:

print(texto1)

-> cool text here

print(texto2)

-> bad text here

#or

print(texto[0])

-> cool text here

print(texto[1])

-> bad text here

Currently my code only gets the first one.

CodePudding user response：

You can achieve this with the BeautifulSoup library.

Example output:

[' cool text here ', ' bad text here ']

from bs4 import BeautifulSoup

def get_class_texts(html_text: str, class_name: str):
    soup = BeautifulSoup(html_text, features="html.parser")
    return [tag.text for tag in soup.select(f".{class_name}")]

print(get_class_texts("<div class='mesage-in'> cool text here </div> <div class ='mesage-in'> bad text here </div>", "mesage-in"))
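If installing BeautifulSoup is not an option, roughly the same result can be sketched with the standard library's html.parser module. This is a swapped-in technique, not what the answer above uses; ClassTextCollector and get_class_texts_stdlib are hypothetical names, and the sketch assumes matching elements contain no unclosed void tags such as a bare <br>.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect the text of every element that carries a given class.

    Hypothetical helper for illustration; assumes well-nested HTML.
    """

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.texts = []   # one string per matching element
        self._depth = 0   # > 0 while we are inside a matching element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1            # nested tag inside a match
        elif self.class_name in classes:
            self._depth = 1
            self.texts.append("")       # start collecting a new entry

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.texts[-1] += data


def get_class_texts_stdlib(html_text, class_name):
    """Return the text of each element with class_name, in document order."""
    parser = ClassTextCollector(class_name)
    parser.feed(html_text)
    parser.close()
    return parser.texts
```

Calling get_class_texts_stdlib with the same HTML string and "mesage-in" should give one list entry per matching div, with the surrounding spaces preserved; call .strip() on each entry if you do not want them.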

CodePudding user response：

To get multiple elements into one list you need to use find_elements. In your case I would use XPath, like so:

from selenium.webdriver.common.by import By
eleArray = self.driver.find_elements(By.XPATH, "//div[@class='mesage-in']")

Then, you can loop over the array like so:

for element in eleArray:
    print(element.text)

Here is a similar example where I get all Latin-encodable span elements from Wikipedia and log them to the console. Feel free to run it and see the results (this product is free to use, by the way, so you can transfer the test case and make your own account via Google login): https://mx1.maxtaf.com/cases/a320a3ad-9949-4bce-87fa-7a0980df8f1f?projectId=bugtestproject2

CodePudding user response：

You can store them in a list. That list will be a list of web elements.

As I see, you are using navegador.find_element, which returns a single web element.

navegador.find_elements, on the other hand, returns a list of web elements.

Also, in recent Selenium versions find_element_by_class_name has been deprecated, so I would suggest using navegador.find_element(By.CLASS_NAME, "") instead.

Code:

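from selenium.webdriver.common.by import By

texto = navegador.find_elements(By.CLASS_NAME, 'message-in')
print(texto[0].text)  # text of the first matching element
print(texto[1].text)  # text of the second matching element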

or 

for txt in texto:
    print(txt.text)
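If you want the texts themselves in a list (or in separate variables), as the question asks, a small helper along these lines may be enough. This is only a sketch: element_texts is a hypothetical name, not part of Selenium, and it assumes every item you pass in exposes a .text attribute the way Selenium web elements do.

```python
def element_texts(elements):
    """Return the .text of each element, in the order given.

    Hypothetical helper: works on anything with a .text attribute,
    such as the list returned by Selenium's find_elements.
    """
    return [element.text for element in elements]


# Usage with Selenium (assumes `navegador` is an already-open WebDriver):
#
#   from selenium.webdriver.common.by import By
#   texto = element_texts(navegador.find_elements(By.CLASS_NAME, "message-in"))
#   print(texto[0])
#   print(texto[1])
#
#   # Separate variables, only safe when exactly two elements matched:
#   texto1, texto2 = texto
```

A list is usually the safer choice, because unpacking into texto1 and texto2 raises a ValueError as soon as the page contains a different number of matching elements.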