How can I extract only the text 'NEEEED' from the HTML below and store it in a variable?
<div id="statsBlock">
<h1>JustText</h1>
<div class="firmInfo">Hey</div>
<div class="firmInfo">HelloText<strong>OtherText</strong></div>
<div class="firmInfo">JustText <strong>NEEEED</strong></div>
<div class="firmInfo">hello</div>
<h2>otherText.</h2>
Code:
soup.find_all("div", {"class": "firmInfo"})
What should I do next? How do I extract only the second one (the NEEEED text) and store it in a variable?
CodePudding user response:
You can try this:
html='''
<div id="statsBlock">
<h1>JustText</h1>
<div class="firmInfo">Hey</div>
<div class="firmInfo">HelloText<strong>OtherText</strong></div>
<div class="firmInfo">JustText <strong>NEEEED</strong></div>
<div class="firmInfo">hello</div>
<h2>otherText.</h2>
'''
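from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
# The divs are siblings of the h1, not its descendants, so step from the h1
# to the third following div with the adjacent-sibling combinator (+).
txt = soup.select_one('#statsBlock h1 + div + div + div strong').text
print(txt)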
Output:
NEEEED
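If you would rather continue from the find_all call in the question, here is a minimal sketch. It assumes every div carries class="firmInfo", as the question's find_all call implies, and that the text you want is always in the second strong tag found. The names blocks, strong_texts and needed are only illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup; the class attribute is an assumption taken from the
# question's find_all("div", {"class": "firmInfo"}) call.
html = '''
<div id="statsBlock">
<h1>JustText</h1>
<div class="firmInfo">Hey</div>
<div class="firmInfo">HelloText<strong>OtherText</strong></div>
<div class="firmInfo">JustText <strong>NEEEED</strong></div>
<div class="firmInfo">hello</div>
<h2>otherText.</h2>
</div>
'''

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet of every matching div.
blocks = soup.find_all("div", {"class": "firmInfo"})

# Collect the text of each <strong>, skipping divs that have none.
strong_texts = [
    block.find("strong").get_text(strip=True)
    for block in blocks
    if block.find("strong") is not None
]

# The second <strong> text is the one the question asks for.
needed = strong_texts[1]
print(needed)
```

Indexing by position breaks if the page adds or removes a strong tag, so a selector tied to the document structure may hold up better on real pages.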
CodePudding user response:
It's the direct child strong element of a parent element with class firmInfo. That parent is an even-numbered child of its container, and it is the only even-numbered firmInfo child that has a strong child of its own, so you can combine :nth-child(even) with > strong. Anchor the selector with a leftmost id selector for the ancestor.
from bs4 import BeautifulSoup as bs
html='''
<div id="statsBlock">
<h1>JustText</h1>
<div class="firmInfo">Hey</div>
<div class="firmInfo">HelloText<strong>OtherText</strong></div>
<div class="firmInfo">JustText <strong>NEEEED</strong></div>
<div class="firmInfo">hello</div>
<h2>otherText.</h2>
'''
soup = bs(html, 'lxml') # or 'html.parser'
print(soup.select_one('#statsBlock .firmInfo:nth-child(even) > strong').text)
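Since the question asks for the text in a variable, here is a small sketch of the same selector with the result stored and guarded. select_one returns None when nothing matches, so calling .text directly would raise AttributeError on a page with a different layout. The class="firmInfo" attributes and the variable names tag and needed are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup; class="firmInfo" is assumed from the question's find_all call.
html = '''
<div id="statsBlock">
<h1>JustText</h1>
<div class="firmInfo">Hey</div>
<div class="firmInfo">HelloText<strong>OtherText</strong></div>
<div class="firmInfo">JustText <strong>NEEEED</strong></div>
<div class="firmInfo">hello</div>
<h2>otherText.</h2>
</div>
'''

soup = BeautifulSoup(html, "html.parser")

# Even-numbered .firmInfo children of #statsBlock, narrowed to the one
# that has a direct <strong> child.
tag = soup.select_one("#statsBlock .firmInfo:nth-child(even) > strong")

# Fall back to None instead of raising when the selector matches nothing.
needed = tag.get_text(strip=True) if tag is not None else None
print(needed)
```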