URL split with BeautifulSoup

Time:07-05

When I run the code below,

import requests
from bs4 import BeautifulSoup

link = "https://www.ambalajstore.com/kategori/bardak-tabak?siralama=fiyat:asc&stoktakiler=1&tp=1"
response = requests.get(link)
html_icerigi = response.content  # raw HTML bytes of the page
corba = BeautifulSoup(html_icerigi, "html.parser")

# find_all returns a list of every matching pagination div
x = corba.find_all("div", {"class": "paginate-content"})
print(x)

I get this result:

[<div >
<a  href="javascript:void(0);">1</a>
<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=2">2</a>
<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=3">3</a>
<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=4">4</a>
<a href="javascript:void(0);">..</a>
<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=13">13</a>
</div>]

What I need is just the number 13, the text of the last link (<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=13">13</a>).

How can I do this?

CodePudding user response:

You can do it like this:

corba.find("div",{"class":"paginate-content"}).find_all('a')[-1].text

This gives you the text content of the last link (13 in your case).

CodePudding user response:

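Since x is a list that contains a single div, index into it first and then take the last anchor:

x[0].find_all('a')[-1].text

You should also handle the case where no anchor tag is found, because indexing an empty list with [-1] raises an IndexError.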
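That guard could look roughly like the sketch below. It assumes the markup shown in the question; the helper name `last_page_text` and the inline HTML sample are made up for illustration and stand in for the live page.

```python
from bs4 import BeautifulSoup


def last_page_text(soup):
    """Return the text of the last <a> in the pagination div, or None."""
    container = soup.find("div", {"class": "paginate-content"})
    if container is None:
        # The page has no pagination block at all.
        return None
    links = container.find_all("a")
    if not links:
        # The block exists but holds no anchors.
        return None
    return links[-1].get_text(strip=True)


# Hypothetical sample mirroring the output from the question.
sample = """
<div class="paginate-content">
<a href="javascript:void(0);">1</a>
<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=13">13</a>
</div>
"""

print(last_page_text(BeautifulSoup(sample, "html.parser")))
print(last_page_text(BeautifulSoup("<p>no pagination</p>", "html.parser")))
```

Returning None instead of raising lets the caller decide what a missing pagination block means, for example treating it as a single-page category.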

CodePudding user response:

There are different approaches possible to scrape the text of your element.

  • using a CSS selector that matches the last <a> element among its siblings:

    corba.select_one('.paginate-content a:last-of-type').text
    
  • picking the last element by its list index:

    corba.find('div',{'class':'paginate-content'}).find_all('a')[-1].text
    
Example

from bs4 import BeautifulSoup
import requests

url = 'https://www.ambalajstore.com/kategori/bardak-tabak?siralama=fiyat:asc&stoktakiler=1&tp=1'
req = requests.get(url)

# Name the parser explicitly so bs4 does not warn and pick one on its own
corba = BeautifulSoup(req.content, 'html.parser')
print(corba.select_one('.paginate-content a:last-of-type').text)

Output

13
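Since the title asks about splitting the URL, another option is to read the page number from the `tp` query parameter of each link rather than from the link text. This is only a sketch under the assumption that `tp` always carries the page number, as it does in the output above; the helper name `last_page_from_href` and the inline sample are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

from bs4 import BeautifulSoup


def last_page_from_href(soup):
    """Return the largest `tp` value among the pagination links, or None."""
    pages = []
    for a in soup.select(".paginate-content a[href]"):
        # Split the href and parse its query string into a dict of lists.
        query = parse_qs(urlsplit(a["href"]).query)
        for value in query.get("tp", []):
            if value.isdigit():
                pages.append(int(value))
    return max(pages) if pages else None


# Hypothetical sample mirroring the output from the question.
sample = """
<div class="paginate-content">
<a href="javascript:void(0);">1</a>
<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=2">2</a>
<a href="javascript:void(0);">..</a>
<a href="/kategori/bardak-tabak?siralama=fiyat:asc&amp;stoktakiler=1&amp;tp=13">13</a>
</div>
"""

print(last_page_from_href(BeautifulSoup(sample, "html.parser")))
```

Taking the maximum instead of the last link keeps the result stable if the site ever appends a "next" arrow after the final page number.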