Beautifulsoup select an element based on second last child

Time:12-21

I'm trying to select the second-to-last child from the breadcrumbs section.

<div class="breadcrumbs">
    <span><a href="/">Home</a></span>
    <i ></i>
    <span><a href="/list1/">List Name 1</a></span>
    <i ></i>
    <span><a href="/list2/">List Name 2</a></span>
    <i ></i>
    <span>List Name 3</span>
</div>

I wrote the following BS4 code in Python to print the second-to-last breadcrumb (List Name 2):

import requests
from bs4 import BeautifulSoup

# link holds the URL of the page being scraped
r = requests.get(link)
soup = BeautifulSoup(r.content, 'lxml')

listname = soup.select_one('.breadcrumbs span:nth-last-child(2) a').text

print(listname)

But it gives an error:

AttributeError: 'NoneType' object has no attribute 'text'

Sometimes the page has 2 breadcrumbs and sometimes 3, which is why I need the second-to-last name specifically.

CodePudding user response:

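Use :nth-last-of-type(2) instead of :nth-last-child(2). The :nth-last-child(2) pseudo-class counts every child of the div, including the <i> separators, so the second-to-last child is an <i> and no span matches. Update the selector to:

listname = soup.select_one('div.breadcrumbs span:nth-last-of-type(2) > a').text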
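To check a selector against the markup from the question, here is a minimal sketch that compares the two pseudo-classes side by side. It assumes the container div carries the class breadcrumbs, as the selector in the question implies; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

html_code = """\
<div class="breadcrumbs">
    <span><a href="/">Home</a></span>
    <i></i>
    <span><a href="/list1/">List Name 1</a></span>
    <i></i>
    <span><a href="/list2/">List Name 2</a></span>
    <i></i>
    <span>List Name 3</span>
</div>"""

soup = BeautifulSoup(html_code, "html.parser")

# :nth-last-child(2) counts every child element, so the second-to-last
# child here is an <i>, and "span:nth-last-child(2)" matches nothing.
by_child = soup.select_one(".breadcrumbs span:nth-last-child(2) a")
print(by_child)  # None

# :nth-last-of-type(2) counts only the <span> siblings.
link = soup.select_one(".breadcrumbs span:nth-last-of-type(2) > a")

# Guard against a missing match instead of calling .text on None.
second_last = link.get_text(strip=True) if link else None
print(second_last)  # List Name 2
```

Checking for None before reading the text avoids the AttributeError on pages where the selector finds nothing.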

CodePudding user response:

You can select all <a> elements inside the breadcrumbs and take the last one with the [-1] index. This works because the final breadcrumb is plain text, not a link:

from bs4 import BeautifulSoup


html_code = """\
<div class="breadcrumbs">
    <span><a href="/">Home</a></span>
    <i ></i>
    <span><a href="/list1/">List Name 1</a></span>
    <i ></i>
    <span><a href="/list2/">List Name 2</a></span>
    <i ></i>
    <span>List Name 3</span>
</div>"""

soup = BeautifulSoup(html_code, "html.parser")

print(soup.select(".breadcrumbs a")[-1].text)

Prints:

List Name 2
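The [-1] approach depends on the last breadcrumb having no link. If that is not guaranteed, one alternative is to index the <span> elements themselves. The sketch below assumes every crumb is a <span> directly inside the breadcrumbs container; second_last_crumb and the sample markup are hypothetical, made up for illustration.

```python
from bs4 import BeautifulSoup


def second_last_crumb(html):
    """Return the text of the second-to-last breadcrumb, or None if there are fewer than two."""
    soup = BeautifulSoup(html, "html.parser")
    # Only direct <span> children count as crumbs; the <i> separators are skipped.
    crumbs = soup.select(".breadcrumbs > span")
    if len(crumbs) < 2:
        return None
    return crumbs[-2].get_text(strip=True)


# Hypothetical pages with two and three breadcrumbs.
two_crumbs = (
    '<div class="breadcrumbs">'
    '<span><a href="/">Home</a></span><i></i>'
    '<span>List Name 1</span>'
    '</div>'
)
three_crumbs = (
    '<div class="breadcrumbs">'
    '<span><a href="/">Home</a></span><i></i>'
    '<span><a href="/list1/">List Name 1</a></span><i></i>'
    '<span>List Name 2</span>'
    '</div>'
)

print(second_last_crumb(two_crumbs))    # Home
print(second_last_crumb(three_crumbs))  # List Name 1
```

Because the function returns None when fewer than two crumbs exist, the caller can handle short breadcrumb trails without catching an exception.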