Get first level li from ul with python

Time:10-20

I tried multiple ways to get only the first-level <li> elements, but I couldn't find the correct one.

I've tried:

for item in soup_inicio.find_all('div',{'class':'menu'}):
    sub_items = item.find_all('li', {'class':'categories'})
    for sub_item in sub_items:
        print(sub_item.text)

This is the HTML I'm trying to scrape:

<div class = 'menu'>
      <ul class = 'navigation' id = "news_ul">
        <li class = 'categories'> A </li>
          <ul>
            <li class = 'categories'> A1 </li>
            <li class = 'categories'> A2 </li>
            <li class = 'categories'> A3 </li>
          </ul>
        <li class = 'categories'> B </li>
          <ul>
            <li class = 'categories'> B1 </li>
          </ul>
        <li class = 'categories'> C </li>
          <ul>
            <li class = 'categories'> C1 </li>
            <li class = 'categories'> C2 </li>
          </ul>
        <li class = 'categories'> D </li>
          <ul>
            <li class = 'categories'> D1 </li>
          </ul>
      </ul>
</div>

My code returns all <li> elements (A, A1, A2, A3, B, B1, C, C1, C2, D, D1).

I would like to get only the first-level elements:

<li>A</li>
<li>B</li>
<li>C</li>
<li>D</li>

CodePudding user response:

You can pass recursive=False to find_all so that it only looks at the direct children of the <ul> instead of all of its descendants:

soup_inicio \
 .find('div',{'class':'menu'}) \
 .find('ul', {'class': 'navigation'}) \
 .find_all('li', recursive=False)

This returns:

[<li class="categories"> A </li>,
 <li class="categories"> B </li>,
 <li class="categories"> C </li>,
 <li class="categories"> D </li>]
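
To see the difference side by side, here is a minimal self-contained sketch. It assumes BeautifulSoup 4 is installed and uses the built-in "html.parser"; the HTML is a shortened copy of the markup from the question, and the names navigation, all_items and top_level are only illustrative.

```python
from bs4 import BeautifulSoup

# Shortened version of the markup from the question (only A and B kept).
html = """
<div class="menu">
  <ul class="navigation" id="news_ul">
    <li class="categories"> A </li>
    <ul>
      <li class="categories"> A1 </li>
      <li class="categories"> A2 </li>
    </ul>
    <li class="categories"> B </li>
    <ul>
      <li class="categories"> B1 </li>
    </ul>
  </ul>
</div>
"""

soup_inicio = BeautifulSoup(html, "html.parser")
navigation = soup_inicio.find("div", {"class": "menu"}).find("ul", {"class": "navigation"})

# Default behaviour: find_all searches every descendant of the <ul>.
all_items = [li.get_text(strip=True) for li in navigation.find_all("li")]

# recursive=False: only the direct children of the <ul> are considered.
top_level = [li.get_text(strip=True) for li in navigation.find_all("li", recursive=False)]

print(all_items)
print(top_level)
```

Note that recursive=False has to be applied on the outer <ul> itself. Calling it on the <div> would return nothing, because the <li> elements are not direct children of the <div>.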

CodePudding user response:

The CSS child combinator (>) can also do this, since it matches only direct children of the element with id news_ul:

lis = soup.select('#news_ul > li')
for li in lis:
    print(li)

# <li class="categories"> A </li>
# <li class="categories"> B </li>
# <li class="categories"> C </li>
# <li class="categories"> D </li>
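
As a rough, runnable sketch of the selector approach, the snippet below compares the descendant selector ('#news_ul li') with the child selector ('#news_ul > li'). It assumes BeautifulSoup 4 with its CSS selector support (the soupsieve package that ships with it); the shortened HTML and the names descendants and children are made up for the illustration.

```python
from bs4 import BeautifulSoup

# Shortened version of the markup from the question (only A and B kept).
html = """
<div class="menu">
  <ul class="navigation" id="news_ul">
    <li class="categories"> A </li>
    <ul>
      <li class="categories"> A1 </li>
    </ul>
    <li class="categories"> B </li>
    <ul>
      <li class="categories"> B1 </li>
    </ul>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A space is the descendant combinator: every <li> anywhere below #news_ul.
descendants = [li.get_text(strip=True) for li in soup.select("#news_ul li")]

# '>' is the child combinator: only <li> elements directly inside #news_ul.
children = [li.get_text(strip=True) for li in soup.select("#news_ul > li")]

print(descendants)
print(children)
```

The selector version is handy when the target <ul> has an id or class, because it does not need the chained find calls.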