How to scrape just one text value from a p tag with bs4

Time:01-31

The website has a single <p> tag, but it contains two text values, and I only want to scrape one of them. The HTML looks like this:

<p  xpath="1">
                        Great Clips

                                                    <br><span >Request Info</span>
                                            </p>

If we target the <p> in the HTML above, we get two text values ("Great Clips" and "Request Info"). I only want to scrape "Great Clips", not both. How would I do that with bs4?

CodePudding user response:

You could use .contents with indexing to extract only the first child, which here is the text node that comes before the <br>:

soup.p.contents[0].strip()

Example

from bs4 import BeautifulSoup

html = '''
<p  xpath="1">
                        Great Clips

                                                    <br><span >Request Info</span>
                                            </p>
'''
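# Name the parser explicitly; otherwise bs4 guesses one and emits a warning
soup = BeautifulSoup(html, 'html.parser')

# contents[0] is the text node before <br>; strip() removes the surrounding whitespace
print(soup.p.contents[0].strip())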

Output

Great Clips
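
Indexing with contents[0] assumes the text you want is always the first child of the <p>. If that is not guaranteed (for example, the tag might start with a <span> or a comment), a slightly more defensive variant could look like the sketch below. Note that first_direct_text is a hypothetical helper name used here for illustration, not part of bs4, and the HTML is a trimmed copy of the snippet from the question.

```python
from bs4 import BeautifulSoup, Comment, NavigableString

html = '''
<p xpath="1">
    Great Clips
    <br><span>Request Info</span>
</p>
'''


def first_direct_text(tag):
    """Return the first non-empty text that is a direct child of tag, or None."""
    for child in tag.children:
        # Nested tags (<br>, <span>) are not NavigableString, so they are skipped;
        # comments are NavigableString subclasses, so exclude them explicitly.
        if isinstance(child, NavigableString) and not isinstance(child, Comment):
            text = child.strip()
            if text:
                return text
    return None


soup = BeautifulSoup(html, "html.parser")
print(first_direct_text(soup.p))
```

Because only direct children are inspected, the text inside the nested <span> ("Request Info") is never returned, and whitespace-only text nodes are ignored.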