Cannot find value from span using Python and Selenium

Time:01-06

I want to obtain the value of the comment from the comment box, which is only displayed if a user has left a comment.

<div data-v-410f78e0="" ><span data-v-410f78e0="" >Message: </span> <div data-v-410f78e0="" ><div data-v-410f78e0="" ><span data-v-410f78e0=""> 603-779 852</span></div> <!----></div></div>

I'm trying to extract 603-779 852 from it. This is what I tried:

 from bs4 import BeautifulSoup

 # Parse the HTML using Beautiful Soup
 soup = BeautifulSoup(html, "html.parser")

 # Find the element containing the string you want to extract
 element = soup.find("div", class_="comment-desc comment-desc-inline")

 # Extract the text from the element and strip leading and trailing whitespace
 string = element.text.strip()

 # Remove "+", "-", and spaces from the string
 modified_string = string.replace("+", "").replace("-", "").replace(" ", "")

 # Take the first character and blank it out if it is "6"
 first_char = modified_string[0:1].replace("6", "")

 # If the remaining first character is "0", rebuild the string from it
 if first_char.startswith("0"):
     final_string = first_char + modified_string[1:]
 else:
     final_string = modified_string

 # Print the final string
 print(final_string)

Answer:

I can guarantee you this isn't the best way to do this. But only half my brain is working today, so this is all I've got.

As they say, "If it's stupid but it works, it ain't stupid."

To fix your code, you can simply change:

string = element.text.strip()

To:

string = element.contents[0].contents[0]
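For context: `.contents` is the list of a tag's direct children, so `element.contents[0].contents[0]` means "the first child of the first child". That breaks as soon as the markup gains a wrapper or a leading whitespace node. A less position-dependent option, sketched below, is to find the "Message:" label by its text and read the next span after it. The helper name `extract_message` is made up for illustration, and the sketch assumes the label text is exactly "Message:". It returns None when the comment box is absent, since the question says the box only appears when a comment exists.

```python
from bs4 import BeautifulSoup

# Same markup as in the question (without class attributes, as shown there).
html = (
    '<div data-v-410f78e0=""><span data-v-410f78e0="">Message: </span> '
    '<div data-v-410f78e0=""><div data-v-410f78e0="">'
    '<span data-v-410f78e0=""> 603-779 852</span></div> <!----></div></div>'
)


def extract_message(markup):
    """Return the text that follows the "Message:" label, or None if it is missing."""
    soup = BeautifulSoup(markup, "html.parser")
    # Locate the label span by its visible text instead of by class name.
    label = soup.find(
        "span", string=lambda s: s is not None and s.strip() == "Message:"
    )
    if label is None:
        return None
    # find_next() walks forward in document order; the next span holds the value.
    value = label.find_next("span")
    return value.get_text(strip=True) if value is not None else None


print(extract_message(html))
print(extract_message("<div>No comment here</div>"))
```

With Selenium, the same function could be fed `driver.page_source`, assuming the page has finished rendering the comment box by then.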

Complete Code:

from bs4 import BeautifulSoup as bs

html = """<div data-v-410f78e0=""><span data-v-410f78e0="">Message: </span> <div data-v-410f78e0=""><div data-v-410f78e0="" class="comment-desc comment-desc-inline"><span data-v-410f78e0=""> 603-779 852</span></div> <!----></div></div>"""
soup = bs(html, "html.parser")

# Find the element containing the string you want to extract
element = soup.find("div", class_="comment-desc comment-desc-inline")

# Take the first child of the element's first child (the text inside the span)
string = element.contents[0].contents[0]

# Remove "+", "-", and spaces from the string
modified_string = string.replace("+", "").replace("-", "").replace(" ", "")

# Take the first character and blank it out if it is "6"
first_char = modified_string[0:1].replace("6", "")

# If the remaining first character is "0", rebuild the string from it
if first_char.startswith("0"):
    final_string = first_char + modified_string[1:]
else:
    final_string = modified_string

# Print the final string
print(final_string)

Will output:

603779852
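Note that the leading 6 is still there in that output. If the goal is simply "digits only", the cleanup steps can be condensed as in the rough sketch below. The names `normalize_number` and `drop_leading_6` are hypothetical, and the optional flag is only one guess at what the comments about removing a "6" so the result starts with "0" were aiming for.

```python
def normalize_number(raw, drop_leading_6=False):
    """Reduce a phone-like string to its digits."""
    # Keep digits only; this drops "+", "-", spaces and any other separators.
    digits = "".join(ch for ch in raw if ch.isdigit())
    # Assumed intent, not confirmed by the question: turn a leading "60" into "0".
    if drop_leading_6 and digits.startswith("60"):
        digits = digits[1:]
    return digits


print(normalize_number(" 603-779 852"))                       # 603779852
print(normalize_number(" 603-779 852", drop_leading_6=True))  # 03779852
```

Leaving the flag off reproduces the output shown above.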
