For example, I have this HTML:
<div >a</div>
<div >b</div>
<div >c</div>
<div >aaaaaa</div>
...... (the X in the item-X class keeps increasing randomly)
<div >aaaaaa</div>
I want to scrape all elements with the class item-X where the value of X is between 5 and 10.
I know how to search by a partial class name:
text = soup.select('div[class*="item-"]')
but I don't know how to add a condition on the value of X.
CodePudding user response:
You can simply use a for loop:
import bs4 as bs
html = """
<div >a</div>
<div >b</div>
<div >c</div>
<div >aaaaaa</div>
<div >aaaaaa</div>
"""
soup = bs.BeautifulSoup(html, 'lxml')
for i in range(5, 11):  # 5 to 10 inclusive
    # select divs whose class attribute contains "item-<i>"
    text = soup.select('div[class*="item-' + str(i) + '"]')
    if text:
        print(text)
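One caveat with the substring selector is worth a quick check: `class*="item-5"` matches any class attribute that contains that text, so it would also pick up a class such as item-50. The sketch below uses made-up markup (the class values are assumptions, not the asker's data) to compare it with an exact class selector:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not the asker's real data.
html = '<div class="item-5">five</div><div class="item-50">fifty</div>'
soup = BeautifulSoup(html, "html.parser")

# Substring match: "item-50" contains "item-5", so both divs are selected.
substring_hits = [t.get_text() for t in soup.select('div[class*="item-5"]')]

# Exact class match: only the div whose class is exactly item-5.
exact_hits = [t.get_text() for t in soup.select("div.item-5")]

print(substring_hits)  # ['five', 'fifty']
print(exact_hits)      # ['five']
```

If the page can contain numbers above 10, the exact form `div.item-5` is probably the safer choice.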
CodePudding user response:
You can use multiple CSS selectors joined by a comma (,):
from bs4 import BeautifulSoup
html_doc = """\
<div >a</div>
<div >b</div>
<div >c</div>
<div >aaaaaa</div>
<div >aaaaaa</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")
texts = soup.select(",".join(f"div.item-{i}" for i in range(5, 11)))
for text in texts:
    print(text)
Prints:
<div >c</div>
<div >aaaaaa</div>
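If the range is large or not known in advance, another option is to parse the number out of the class name and compare it directly. This is only a sketch: the sample markup, the `in_range` helper, and its `low`/`high` parameters are assumptions made for illustration, and it assumes the classes are literally of the form item-N.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup; the class values are assumptions for illustration.
html_doc = """
<div class="item-1">a</div>
<div class="item-5">b</div>
<div class="item-10">c</div>
<div class="item-50">d</div>
<div class="other item-7">e</div>
"""

ITEM_RE = re.compile(r"item-(\d+)")

def in_range(tag, low=5, high=10):
    """Return True for a div that has a class item-X with low <= X <= high."""
    if tag.name != "div":
        return False
    # "class" is a multi-valued attribute, so BeautifulSoup returns a list.
    for cls in tag.get("class", []):
        m = ITEM_RE.fullmatch(cls)
        if m and low <= int(m.group(1)) <= high:
            return True
    return False

soup = BeautifulSoup(html_doc, "html.parser")
matches = soup.find_all(in_range)  # find_all accepts a function taking a tag
texts = [tag.get_text() for tag in matches]
print(texts)  # ['b', 'c', 'e']
```

Because the number is compared as an integer, item-50 is excluded even though its class name starts with item-5.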