Is it possible to extract text from span with Beautiful Soup?

Time:09-01

<span>
 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 36 36" width="16" height="16" 
  >
  <path fill-rule="evenodd" d="M9 19.5V7.25A4.25 4.25 0 0113.25 
  3a1 1 0 010 2A2.25 2.25 0 0011 7.25V19.5h3a3.5 3.5 0 013.752-3.491 4.5 4.5 0 017.496 
  0A3.5 3.5 0 0129 19.5h2a1 1 0 010 2h-.8l.14 1.403a6 6 0 01-5.46 6.575l.569 1.706a1 1 0 
  11-1.898.632l-.683-2.051a1.001 1.001 0 01-.05-.265h-9.635a1.001 1.001 0 
  01-.05.265l-.684 2.051a1 1 0 11-1.898-.632l.569-1.705a6 6 0 01-5.46-6.576L5.8 21.5H5a1 
  1 0 010-2h4zm7 0h11a1.5 1.5 0 00-1.969-1.426l-.87.286-.361-.842a2.5 2.5 0 00-4.6 
  0l-.36.842-.871-.286A1.5 1.5 0 0016 19.5zm-8.19 2l-.16 1.602a4 4 0 003.98 
  4.398h12.74a4 4 0 003.98-4.398l-.16-1.602H7.81zm6.86-11.949a1 1 0 01-2 0V7.5a1 1 0 112 
  0v2.051zm3.12- 
  2.316a1 1 0 01-1.897.633l-.684-2.052a1 1 0 011.898-.632l.684 2.051z"></path>
 </svg>
 1 baño
</span>

How can I extract the text '1 baño' using find_all from BeautifulSoup?

CodePudding user response:

Find the <span> tag and use its .text property, then strip the surrounding whitespace:

from bs4 import BeautifulSoup


html = """\
<span>
 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 36 36" width="16" height="16" 
  >
  <path fill-rule="evenodd" d="M9 19.5V7.25A4.25 4.25 0 0113.25 
  3a1 1 0 010 2A2.25 2.25 0 0011 7.25V19.5h3a3.5 3.5 0 013.752-3.491 4.5 4.5 0 017.496 
  0A3.5 3.5 0 0129 19.5h2a1 1 0 010 2h-.8l.14 1.403a6 6 0 01-5.46 6.575l.569 1.706a1 1 0 
  11-1.898.632l-.683-2.051a1.001 1.001 0 01-.05-.265h-9.635a1.001 1.001 0 
  01-.05.265l-.684 2.051a1 1 0 11-1.898-.632l.569-1.705a6 6 0 01-5.46-6.576L5.8 21.5H5a1 
  1 0 010-2h4zm7 0h11a1.5 1.5 0 00-1.969-1.426l-.87.286-.361-.842a2.5 2.5 0 00-4.6 
  0l-.36.842-.871-.286A1.5 1.5 0 0016 19.5zm-8.19 2l-.16 1.602a4 4 0 003.98 
  4.398h12.74a4 4 0 003.98-4.398l-.16-1.602H7.81zm6.86-11.949a1 1 0 01-2 0V7.5a1 1 0 112 
  0v2.051zm3.12- 
  2.316a1 1 0 01-1.897.633l-.684-2.052a1 1 0 011.898-.632l.684 2.051z"></path>
 </svg>
 1 baño
</span>"""

soup = BeautifulSoup(html, "html.parser")

print(soup.find("span").text.strip())

Prints:

1 baño
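
Since the question asks about find_all specifically, here is a minimal sketch of the same idea applied to every <span> on a page. The markup is a shortened stand-in for the snippet in the question, and the second <span> ("2 habitaciones") is a made-up example added only to show that find_all returns a list. get_text(strip=True) strips each text fragment and drops the whitespace-only ones left around the <svg>.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (path data abbreviated).
# The second <span> is hypothetical, added only for illustration.
html = """
<span>
 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 36 36" width="16" height="16">
  <path fill-rule="evenodd" d="M9 19.5V7.25"></path>
 </svg>
 1 baño
</span>
<span>
 <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 36 36" width="16" height="16">
  <path d="M0 0"></path>
 </svg>
 2 habitaciones
</span>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns every matching <span>; get_text(strip=True) strips each
# text fragment inside the tag and skips the ones that are only whitespace.
texts = [span.get_text(strip=True) for span in soup.find_all("span")]
print(texts)
```

With this input the list should contain one cleaned string per <span>.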

EDIT: If the <span> always contains the word baño, you can match on that text instead:

tag = soup.find(lambda tag: tag.name == "span" and "baño" in tag.text)
print(tag.text.strip())
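
The same kind of filter function can also be passed to find_all if more than one such <span> may exist. This is only a sketch: the sample markup and the helper name is_bath_span are invented for illustration and are not part of the original page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one span mentions "baño", the other does not.
html = """<div>
<span><svg><path d="M0 0"></path></svg> 1 baño</span>
<span><svg><path d="M0 0"></path></svg> 3 habitaciones</span>
</div>"""

soup = BeautifulSoup(html, "html.parser")


def is_bath_span(tag):
    """Return True for <span> tags whose text contains the word 'baño'."""
    return tag.name == "span" and "baño" in tag.get_text()


# find_all accepts a function and keeps every tag for which it returns True.
matches = [tag.get_text(strip=True) for tag in soup.find_all(is_bath_span)]
print(matches)
```

Note that the filter looks at the full text of each tag, so if spans were nested, an outer <span> wrapping a matching inner one would match as well.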