I am parsing HTML with Beautiful Soup. I ran the command
soup.find_all("span", {"class": "budget-list__data__number budget-list__number show-for-medium"})
and obtained:
[<span >
4 000 €
<span >24 <span >votes</span></span>
</span>, <span >
25 000 €
<span >24 <span >votes</span></span>
</span>, <span >
14 000 €
<span >23 <span >votes</span></span>
</span>, <span >
35 000 €
.
.
.
I want to keep only the monetary amounts (e.g. 4 000 €) and ignore the nested <span> elements that hold the vote counts. I thought about using span.clear(), but that does not do the trick. Do you have any suggestions?
CodePudding user response:
Try:
spans = soup.find_all(
    "span",
    {"class": "budget-list__data__number budget-list__number show-for-medium"},
)
for span in spans:
    # contents[0] is the text node that precedes the nested <span>
    print(span.contents[0].strip())
Prints:
4 000 €
25 000 €
14 000 €
35 000 €
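If the amount is not guaranteed to be the first child of each span, a slightly more defensive variant is to collect only the text nodes that are direct children of the tag. The following is a minimal, self-contained sketch: the sample HTML is modelled on the output in the question, the inner class names (votes-count, votes-label) are made up for illustration, and own_text is a hypothetical helper, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question's output. The inner class names
# are assumptions for this sketch; the real page may use different ones.
html = """
<span class="budget-list__data__number budget-list__number show-for-medium">
4 000 €
<span class="votes-count">24 <span class="votes-label">votes</span></span>
</span>
<span class="budget-list__data__number budget-list__number show-for-medium">
25 000 €
<span class="votes-count">24 <span class="votes-label">votes</span></span>
</span>
"""


def own_text(tag):
    """Hypothetical helper: text directly inside `tag`, skipping nested tags."""
    # recursive=False restricts the search to direct children;
    # string=True keeps only text nodes.
    return "".join(tag.find_all(string=True, recursive=False)).strip()


soup = BeautifulSoup(html, "html.parser")
spans = soup.find_all(
    "span",
    {"class": "budget-list__data__number budget-list__number show-for-medium"},
)
amounts = [own_text(span) for span in spans]
print(amounts)  # ['4 000 €', '25 000 €']
```

Unlike span.contents[0], this does not depend on where the text sits relative to the nested span, at the cost of one extra search per tag.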