Selecting 4th child div using BeautifulSoup

Time:11-07

I have five child divs under a main div whose id is main_div, but the child divs have no id or class.

I am trying to get the text of the 4th child, "div text 04".

Here is my HTML:

<div id="main_div">
    <div>div 01</div>
    <div>div 02</div>
    <div>div 03</div>
    <div>div text 04</div>
    <div>div 05</div>
</div>

I'm trying this, but it doesn't work because the child divs have no class. How can I get the text of the 4th child div?

from bs4 import BeautifulSoup as bs

# r is the HTTP response object holding the page
soup = bs(r.text, 'html.parser')
html_soup = soup.find('div', {"id": 'main_div'})

Thanks

CodePudding user response:

If you're sure the structure won't change and all you want is the 4th div, then try this, for example:

from bs4 import BeautifulSoup

sample_html = """<div id="main_div">
    <div>div 01</div>
    <div>div 02</div>
    <div>div 03</div>
    <div>div text 04</div>
    <div>div 05</div>
</div>"""

text = (
    BeautifulSoup(sample_html, 'html.parser')
    .find(id='main_div')
    .find_all('div')[3]  # index 3 is the 4th div (0-based)
    .text
)
print(text)
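
If the child divs might later contain divs of their own, note that find_all('div') searches all descendants, so the index could drift. A minimal sketch, assuming that situation: passing recursive=False limits the search to direct children. The helper name nth_child_div_text and the nested div in the sample are hypothetical, added only for illustration.

```python
from bs4 import BeautifulSoup

# Same sample as above, plus a hypothetical nested div to show the difference.
sample_html = """<div id="main_div">
    <div>div 01</div>
    <div>div 02 <div>nested</div></div>
    <div>div 03</div>
    <div>div text 04</div>
    <div>div 05</div>
</div>"""


def nth_child_div_text(html, parent_id, n):
    """Return the text of the n-th (1-based) direct child div, or None."""
    soup = BeautifulSoup(html, "html.parser")
    parent = soup.find(id=parent_id)
    if parent is None:
        return None
    # recursive=False keeps only direct children, ignoring nested divs.
    children = parent.find_all("div", recursive=False)
    if not 1 <= n <= len(children):
        return None
    return children[n - 1].get_text(strip=True)


print(nth_child_div_text(sample_html, "main_div", 4))  # prints: div text 04
```

The helper returns None instead of raising when the parent is missing or has fewer than n child divs, which may be easier to handle when scraping pages whose layout can vary.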

Or use a CSS selector:

soup = BeautifulSoup(sample_html, 'html.parser')
parent = soup.find(id="main_div")
# 1-based position of the child div to select
n = 4
print(parent.select_one("div:nth-of-type(" + str(n) + ")").getText())

Output:

div text 04
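
The same lookup can also be written as a single selector. This is a sketch, assuming you want only direct children of main_div: the child combinator (>) restricts the match to direct children, and the None check avoids an AttributeError when nothing matches. The variable names node and fourth_text are illustrative.

```python
from bs4 import BeautifulSoup

html = """<div id="main_div">
    <div>div 01</div>
    <div>div 02</div>
    <div>div 03</div>
    <div>div text 04</div>
    <div>div 05</div>
</div>"""

soup = BeautifulSoup(html, "html.parser")
n = 4  # 1-based position of the child div

# "#main_div > div" matches only divs that are direct children of main_div.
node = soup.select_one(f"#main_div > div:nth-of-type({n})")

# select_one returns None when no element matches.
fourth_text = node.get_text(strip=True) if node is not None else None
print(fourth_text)  # prints: div text 04
```

nth-of-type counts only sibling divs, whereas nth-child would count siblings of every tag type, so the two can differ if other elements are mixed in among the divs.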