The website has a single <p> element, but it contains two text values and I want to scrape only one of them. The HTML looks like this:
<p xpath="1">
Great Clips
<br><span >Request Info</span>
</p>
In the HTML above, targeting <p> returns two text values ("Great Clips" and "Request Info"). I only want to scrape "Great Clips", not both. How can I do that with bs4?
CodePudding user response:
You could use .contents with indexing to extract only the first child, which here is the text node before the <br>:
soup.p.contents[0].strip()
Example
from bs4 import BeautifulSoup
html = '''
<p xpath="1">
Great Clips
<br><span >Request Info</span>
</p>
'''
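# Name the parser explicitly to avoid the "no parser was explicitly specified" warning
soup = BeautifulSoup(html, 'html.parser')
# The first child of <p> is the text node that precedes <br>
print(soup.p.contents[0].strip())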
Output
Great Clips
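
Indexing with [0] relies on the wanted text being the very first child of <p>. If that is not guaranteed, a slightly more defensive variant is to look only at the direct text nodes of the tag. The following is a minimal sketch, not part of the original answer: the helper name own_text is hypothetical, and the html.parser choice is an assumption.

```python
from bs4 import BeautifulSoup, NavigableString

html = '''
<p xpath="1">
Great Clips
<br><span >Request Info</span>
</p>
'''

soup = BeautifulSoup(html, 'html.parser')


def own_text(tag):
    """Join the non-empty text nodes that are direct children of tag.

    Hypothetical helper for illustration: text inside nested tags
    such as <span> is skipped.
    """
    parts = (
        child.strip()
        for child in tag.children
        if isinstance(child, NavigableString)
    )
    return ' '.join(part for part in parts if part)


# First direct text node of <p>, without descending into <span>
first_text = soup.p.find(string=True, recursive=False).strip()

# All direct text nodes of <p>, joined together
direct_text = own_text(soup.p)

print(first_text)
print(direct_text)
```

Both calls skip "Request Info" because that text belongs to the nested <span>, not to <p> itself.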