How to get text from HTML using Selenium and Python when two elements have the same class name

Time:03-29

I have HTML like this:

<div class='message-in'> cool text here </div>
<div class='message-in'> bad text here </div>

and my Python code is:

texto = navegador.find_element_by_class_name('message-in').text
print(texto)

Is it possible to get all elements with the same class name and put them in an array, or assign them to separate variables, like this?

Output:

print(texto1)

-> cool text here

print(texto2)

-> bad text here

#or

print(texto[0])

-> cool text here

print(texto[1])

-> bad text here

Currently my code only gets the first one.

CodePudding user response:

As per the HTML:

<div class='message-in'> cool text here </div>
<div class='message-in'> bad text here </div>

The following line of code:

texto = navegador.find_element_by_class_name('message-in').text

will always identify the first matching element, extract its text and assign it to texto. So when you print texto, the text of the very first element, i.e. cool text here, is printed.


Solution

You can get all elements with the same class name, i.e. message-in, and put them in a list as follows:

from selenium.webdriver.common.by import By
texto = navegador.find_elements(By.CLASS_NAME, 'message-in')

Now you can print the desired texts with respect to their index as follows:

  • To print cool text here:

    print(texto[0].text) # prints-> cool text here
    
  • To print bad text here:

    print(texto[1].text) # prints-> bad text here
    

Outro

You can also create a list of the texts using a list comprehension and print them as follows:

texto = [my_elem.text for my_elem in navegador.find_elements(By.CLASS_NAME, "message-in")]
print(texto[0]) # prints-> cool text here
print(texto[1]) # prints-> bad text here
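The question also asked about separate variables such as texto1 and texto2. A minimal sketch of that, assuming the list of texts has already been built as above; here a hard-coded list stands in for the Selenium result so the snippet runs without a browser:

```python
# Stand-in for the list built from navegador.find_elements(...) above.
textos = ["cool text here", "bad text here"]

# Unpacking only works when the number of names matches the number of items;
# otherwise Python raises ValueError.
texto1, texto2 = textos
print(texto1)
print(texto2)

# When the number of matches is not known in advance, iterate with an index.
for i, t in enumerate(textos, start=1):
    print(f"texto{i}: {t}")
```

Keeping the list and indexing into it is usually the more flexible choice, since the number of matching elements on a real page can change.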

CodePudding user response:

You can also achieve this with the BeautifulSoup library:

from bs4 import BeautifulSoup

def get_class_texts(html_text: str, class_name: str):
    # Parse the markup and return the text of every element with the given class
    soup = BeautifulSoup(html_text, features="html.parser")
    return [tag.text for tag in soup.select(f".{class_name}")]

print(get_class_texts("<div class='message-in'> cool text here </div> <div class='message-in'> bad text here </div>", "message-in"))

Example output:

[' cool text here ', ' bad text here ']
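Note that the BeautifulSoup example output keeps the spaces from the markup, whereas Selenium's .text returns trimmed text. A small sketch of normalizing the result, using the example output as a stand-in list (the variable names are only illustrative):

```python
# Stand-in for the list returned by get_class_texts(...) above.
raw_texts = [" cool text here ", " bad text here "]

# str.strip() removes leading and trailing whitespace from each entry.
clean_texts = [text.strip() for text in raw_texts]
print(clean_texts)
```

Inside get_class_texts itself, tag.get_text(strip=True) should give a similar result for simple elements like these.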

CodePudding user response:

To get multiple elements into one array you need to use find_elements. In your case I would use XPath like so:

eleArray = navegador.find_elements(By.XPATH, "//div[@class='message-in']")

Then, you can loop over the array like so:

for element in eleArray:
    print(element.text)

Here is a similar example where I get all Latin-encodable span elements from Wikipedia and log them to the console. Feel free to run it and see the results (the product is free to use, so you can copy the test case and create your own account via Google login): https://mx1.maxtaf.com/cases/a320a3ad-9949-4bce-87fa-7a0980df8f1f?projectId=bugtestproject2
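One caveat with the XPath approach: a predicate of the form [@class='...'] compares the whole attribute value, so it would not match an element that carries additional classes. A rough sketch of a token-based XPath, assuming that case applies to your page; class_token_xpath is a hypothetical helper name, not part of Selenium:

```python
def class_token_xpath(tag: str, class_name: str) -> str:
    """Build an XPath matching <tag> elements whose class list contains class_name."""
    # Padding the normalized class attribute with spaces lets us search for
    # the class as a whole token rather than as a substring.
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


xpath = class_token_xpath("div", "message-in")
print(xpath)

# With a live driver the string would be passed on like this:
#   navegador.find_elements(By.XPATH, xpath)
```

By.CLASS_NAME and a CSS selector such as div.message-in already match on individual class tokens, so this only matters if you prefer XPath.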

CodePudding user response:

You can store them into a list. That list will be a list of web elements.

As I see, you are using navegador.find_element which will return a single web element.

Whereas navegador.find_elements will return a list of web elements.

Also, find_element_by_class_name is deprecated in Selenium 4 and has been removed in later 4.x releases, so I would suggest using navegador.find_element(By.CLASS_NAME, "message-in") instead.

Code:

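from selenium.webdriver.common.by import By

texto = navegador.find_elements(By.CLASS_NAME, 'message-in')
print(texto[0].text)  # .text is needed; printing the element itself shows the WebElement object
print(texto[1].text)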

or 

for txt in texto:
    print(txt.text)