I'm trying to select the second-to-last item in the breadcrumbs section.
<div class="breadcrumbs">
<span><a href="/">Home</a></span>
<i ></i>
<span><a href="/list1/">List Name 1</a></span>
<i ></i>
<span><a href="/list2/">List Name 2</a></span>
<i ></i>
<span>List Name 3</span>
</div>
I wrote this code with BeautifulSoup 4 in Python to print the second-to-last item, which should be "List Name 2":
import requests
from bs4 import BeautifulSoup
# link holds the page URL (defined elsewhere)
r = requests.get(link)
soup = BeautifulSoup(r.content, 'lxml')
listname = soup.select_one('.breadcrumbs span:nth-last-child(2) a').text
print(listname)
But it raises an error:
AttributeError: 'NoneType' object has no attribute 'text'
Sometimes the page has 2 breadcrumbs and sometimes 3, which is why I need the second-to-last name rather than a fixed position.
CodePudding user response:
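In the posted HTML the <i> separators are also children of the div, so the second-to-last child is an <i> and span:nth-last-child(2) matches nothing. Count only the <span> elements with :nth-last-of-type(2) instead:
listname = soup.select_one('div.breadcrumbs span:nth-last-of-type(2) > a').text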
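To see why the original selector returns None, here is a minimal sketch that runs both pseudo-classes against the markup from the question. It assumes the container carries class="breadcrumbs" (implied by the selector in the question) and uses html.parser so no extra parser is required.

```python
from bs4 import BeautifulSoup

# Markup from the question; class="breadcrumbs" is assumed from the selector used there.
html_code = """\
<div class="breadcrumbs">
<span><a href="/">Home</a></span>
<i></i>
<span><a href="/list1/">List Name 1</a></span>
<i></i>
<span><a href="/list2/">List Name 2</a></span>
<i></i>
<span>List Name 3</span>
</div>"""

soup = BeautifulSoup(html_code, "html.parser")

# :nth-last-child counts every sibling element, so position 2 from the end is an <i>.
by_child = soup.select_one(".breadcrumbs span:nth-last-child(2) a")

# :nth-last-of-type counts only siblings with the same tag name (<span> here).
by_type = soup.select_one(".breadcrumbs span:nth-last-of-type(2) a")

print(by_child)      # None
print(by_type.text)  # List Name 2
```

Calling .text on the first result is what produces the AttributeError, because select_one returns None when nothing matches.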
CodePudding user response:
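You can select all <a> elements inside the breadcrumbs and take the last one with the [-1] index. The final breadcrumb has no link, so the last <a> is the second-to-last item: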
from bs4 import BeautifulSoup
html_code = """\
<div class="breadcrumbs">
<span><a href="/">Home</a></span>
<i ></i>
<span><a href="/list1/">List Name 1</a></span>
<i ></i>
<span><a href="/list2/">List Name 2</a></span>
<i ></i>
<span>List Name 3</span>
</div>"""
soup = BeautifulSoup(html_code, "html.parser")
print(soup.select(".breadcrumbs a")[-1].text)
Prints:
List Name 2
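Since the number of breadcrumbs varies, it may be worth guarding against a page with no linked breadcrumb at all, where [-1] would raise IndexError. A minimal sketch follows; the helper name last_linked_breadcrumb is hypothetical, and the two-item markup is an assumed example of the shorter case mentioned in the question.

```python
from bs4 import BeautifulSoup


def last_linked_breadcrumb(html):
    """Return the text of the last linked breadcrumb, or None if there is none.

    Hypothetical helper: assumes the container has class="breadcrumbs"
    and that the final (current page) breadcrumb is not a link.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = soup.select(".breadcrumbs a")
    return links[-1].get_text(strip=True) if links else None


# Assumed sample pages: one with a longer trail, one with only two breadcrumbs.
long_trail = """\
<div class="breadcrumbs">
<span><a href="/">Home</a></span>
<i></i>
<span><a href="/list1/">List Name 1</a></span>
<i></i>
<span>List Name 2</span>
</div>"""

short_trail = """\
<div class="breadcrumbs">
<span><a href="/">Home</a></span>
<i></i>
<span>List Name 1</span>
</div>"""

print(last_linked_breadcrumb(long_trail))     # List Name 1
print(last_linked_breadcrumb(short_trail))    # Home
print(last_linked_breadcrumb("<div></div>"))  # None
```

Returning None instead of raising leaves it to the caller to decide how to handle a page without breadcrumbs.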