How to parse with BeautifulSoup in Python?

Time:10-17

For example, I have HTML like this:

<div class="container">
    <div class="blablabla1">
        <div class="blablabla2">
            <div class="blablabla3">
                <span class="hello">Hello</span>
            </div>
        </div>
    </div>
</div>

How can I get the <span> text or the value of its class attribute?

Do I need to find all the containers first?

CodePudding user response:

Your question is not entirely clear, but in general you can access the class and text with a CSS selector like this:

from bs4 import BeautifulSoup

html = '''
<div class="container">
    <div class="blablabla1">
        <div class="blablabla2">
            <div class="blablabla3">
                <span class="hello">Hello</span>
            </div>
        </div>
    </div>
</div>
'''

soup = BeautifulSoup(html, "lxml")

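# Text of the first <span> found anywhere under div.container
spanText = soup.select_one('div.container span').text
# The class attribute of that <span>; BeautifulSoup returns it as a list
spanClass = soup.select_one('div.container span')['class']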
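
To make the result concrete, here is a minimal self-contained sketch of the same selector approach, using the markup from the question. It assumes the built-in "html.parser" instead of "lxml" (so nothing extra needs installing) and guards against select_one() returning None when nothing matches; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = """
<div class="container">
    <div class="blablabla1">
        <div class="blablabla2">
            <div class="blablabla3">
                <span class="hello">Hello</span>
            </div>
        </div>
    </div>
</div>
"""

# Assumption: the standard-library parser is good enough for this snippet
soup = BeautifulSoup(html, "html.parser")

# The descendant selector reaches the span at any depth, so there is
# no need to walk through each intermediate div first
span = soup.select_one("div.container span.hello")

# select_one() returns None when nothing matches, so guard before use
span_text = span.get_text(strip=True) if span is not None else None
span_classes = span.get("class", []) if span is not None else []

print(span_text)     # Hello
print(span_classes)  # ['hello']
```

Note that class is a multi-valued attribute, so you get a list back even when the element has only one class.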

CodePudding user response:

Once you have obtained the soup, you can use the find_all() method to find all <span> elements with the hello class:

all_hello_spans = soup.find_all("span", class_="hello")
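
As a rough sketch of how the find_all() result can be used, the snippet below loops over the matches and also shows that searching inside the container first is optional. The second <span class="other"> is not in the original question; it is added here only to show the class filter doing something. "html.parser" is again an assumption to keep the example dependency-free.

```python
from bs4 import BeautifulSoup

# Shortened version of the question's markup; the "other" span is a
# hypothetical addition for illustration only
html = """
<div class="container">
    <div class="blablabla1">
        <span class="hello">Hello</span>
    </div>
    <span class="other">Bye</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Search the whole document: tag name first, then a class filter.
# The keyword is class_ because class is a reserved word in Python.
all_hello_spans = soup.find_all("span", class_="hello")
texts = [span.get_text(strip=True) for span in all_hello_spans]
classes = [span.get("class") for span in all_hello_spans]

# Optional: restrict the search to one container first
container = soup.find("div", class_="container")
spans_in_container = container.find_all("span") if container is not None else []

print(texts)                    # ['Hello']
print(classes)                  # [['hello']]
print(len(spans_in_container))  # 2
```

So you do not have to find the containers first; that is only useful if the same class name also appears elsewhere on the page and you want to narrow the search.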
