This is my first time using BeautifulSoup as a scraping tool, and I'm working through it one step at a time.
I've used soup.find_all("div", class_="product-box__inner")
to get the list of elements I want, but this particular part isn't making sense to me right now. My question is below.
Here is the HTML; my target is "$0". I have tried
element.find("span", title=re.compile("$"))
and I can't use element.select("dt > dd > span > span")
because there are multiple elements with the same tag structure that I don't need at all. Is there a way to target the span with data-fees-annual-value="" so that .text works?
<div >
<dt >Annual fee</dt>
<dd >
<span>
<span data-fees-annual-value="">$0</span>
</span>
</dd>
</div>
CodePudding user response:
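If you want to find an element by its text, use string instead of title. Escape the $ as well, since an unescaped $ in a regex is the end-of-string anchor and matches any text:
element.find("span", string=re.compile(r'\$'))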
Output:
<span data-fees-annual-value="">$0</span>
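The difference between an escaped and an unescaped pattern can be checked with a small sketch. The extra "Rewards" span and the variable names below are made up for illustration; the point is only that `$` on its own anchors at the end of the string, while `\$` requires a literal dollar sign.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the "Rewards" span is added only for illustration.
html = """
<div>
<span>Rewards</span>
<span data-fees-annual-value="">$0</span>
</div>
"""
element = BeautifulSoup(html, "html.parser")

# Unescaped "$" is an end-of-string anchor, so the first span with any text matches.
unescaped = element.find("span", string=re.compile("$"))
# Escaped "\$" only matches text containing a literal dollar sign.
escaped = element.find("span", string=re.compile(r"\$"))

print(unescaped.text)  # Rewards
print(escaped.text)    # $0
```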
CodePudding user response:
You are close to your goal with CSS selectors. They can be made more specific by referencing the data-fees-annual-value attribute directly:
soup.select_one('span[data-fees-annual-value]').text
Example
from bs4 import BeautifulSoup
html="""
<div >
<dt >Annual fee</dt>
<dd >
<span>
<span data-fees-annual-value="">$0</span>
</span>
</dd>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
print(soup.select_one('span[data-fees-annual-value]').text)
Output
$0
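Since the question starts from soup.find_all("div", class_="product-box__inner"), the same attribute selector can also be applied to each product element instead of the whole document. A rough sketch, assuming each product box holds at most one such span; the two-box markup and the `annual_fee` helper are hypothetical and not taken from the original page:

```python
from bs4 import BeautifulSoup

# Hypothetical page with two product boxes; only the first has the fee span.
html = """
<div class="product-box__inner">
<dl><dt>Annual fee</dt><dd><span><span data-fees-annual-value="">$0</span></span></dd></dl>
</div>
<div class="product-box__inner">
<dl><dt>Other</dt><dd><span><span>n/a</span></span></dd></dl>
</div>
"""
soup = BeautifulSoup(html, "html.parser")


def annual_fee(element):
    """Return the annual fee text for one product box, or None if absent."""
    span = element.select_one("span[data-fees-annual-value]")
    return span.get_text(strip=True) if span else None


fees = [annual_fee(box) for box in soup.find_all("div", class_="product-box__inner")]
print(fees)  # ['$0', None]
```

Checking for None before reading the text avoids an AttributeError on boxes that lack the attribute. If you prefer find over CSS selectors, element.find("span", attrs={"data-fees-annual-value": True}) should select the same span.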