I tried multiple ways to get the first-level <li> elements, but I couldn't find the correct way.
I've tried:
for item in soup_inicio.find_all('div', {'class': 'menu'}):
    sub_items = item.find_all('li', {'class': 'categories'})
    for sub_item in sub_items:
        print(sub_item.text)
This is what I'm trying to scrape:
<div class = 'menu'>
<ul class = 'navigation' id = "news_ul">
<li class = 'categories'> A </li>
<ul>
<li class = 'categories'> A1 </li>
<li class = 'categories'> A2 </li>
<li class = 'categories'> A3 </li>
</ul>
<li class = 'categories'> B </li>
<ul>
<li class = 'categories'> B1 </li>
</ul>
<li class = 'categories'> C </li>
<ul>
<li class = 'categories'> C1 </li>
<li class = 'categories'> C2 </li>
</ul>
<li class = 'categories'> D </li>
<ul>
<li class = 'categories'> D1 </li>
</ul>
</ul>
</div>
This gives me all <li> elements (A, A1, A2, A3, B, B1, C, C1, C2, D, D1).
I would like to get only the first-level elements:
<li>A</li>
<li>B</li>
<li>C</li>
<li>D</li>
CodePudding user response:
You can set the recursive argument to False, so that find_all only considers the direct children of the tag:
soup_inicio \
.find('div',{'class':'menu'}) \
.find('ul', {'class': 'navigation'}) \
.find_all('li', recursive=False)
This returns:
[<li class="categories"> A </li>,
<li class="categories"> B </li>,
<li class="categories"> C </li>,
<li class="categories"> D </li>]
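One caveat with the chained call above: find returns None when nothing matches, so the chain raises AttributeError if the div or the ul is missing. Below is a minimal, self-contained sketch of the same recursive=False idea with that case guarded. The HTML string is a shortened copy of the markup from the question, and the helper name top_level_items and the choice of the built-in "html.parser" are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (assumption: same nesting,
# with the sub-lists placed as siblings of the first-level <li> elements).
HTML = """
<div class="menu">
  <ul class="navigation" id="news_ul">
    <li class="categories"> A </li>
    <ul>
      <li class="categories"> A1 </li>
      <li class="categories"> A2 </li>
    </ul>
    <li class="categories"> B </li>
    <ul>
      <li class="categories"> B1 </li>
    </ul>
  </ul>
</div>
"""


def top_level_items(html):
    """Return the text of the <li> elements that are direct children of ul.navigation."""
    soup = BeautifulSoup(html, "html.parser")
    menu = soup.find("div", class_="menu")
    # find() returns None when there is no match, so guard each step.
    nav = menu.find("ul", class_="navigation") if menu else None
    if nav is None:
        return []
    # recursive=False restricts the search to direct children of nav.
    return [li.get_text(strip=True) for li in nav.find_all("li", recursive=False)]


print(top_level_items(HTML))
```

Returning an empty list for a missing menu is just one possible choice; raising an error may suit a scraper better if the menu is expected to always be there.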
CodePudding user response:
Maybe the > (child) combinator can do this:
lis = soup.select('#news_ul > li')
for li in lis:
    print(li)
## <li class="categories"> A </li>
## <li class="categories"> B </li>
## <li class="categories"> C </li>
## <li class="categories"> D </li>
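To see why the > matters, here is a small sketch comparing the child combinator with the plain descendant selector (a space) on a shortened copy of the question's markup. The HTML string, the variable names, and the "html.parser" choice are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (assumed structure).
HTML = """
<div class="menu">
  <ul class="navigation" id="news_ul">
    <li class="categories"> A </li>
    <ul>
      <li class="categories"> A1 </li>
      <li class="categories"> A2 </li>
    </ul>
    <li class="categories"> B </li>
    <ul>
      <li class="categories"> B1 </li>
    </ul>
  </ul>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# "#news_ul > li" matches only <li> elements whose parent is #news_ul.
direct = [li.get_text(strip=True) for li in soup.select("#news_ul > li")]

# "#news_ul li" matches every <li> anywhere below #news_ul.
all_items = [li.get_text(strip=True) for li in soup.select("#news_ul li")]

print(direct)
print(all_items)
```

The first list should contain only the first-level items, while the second reproduces the problem from the question by also including the nested ones.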