Extracting from a specific class in Beautiful Soup

Time:02-14

I'm trying to extract only the weight of each product from the site below. I almost have it, but the extracted output also contains several other things. How can I filter it down to just the product weight?

Source: https://www.tendercuts.in/chicken

Here is the code I used:

   import requests
   from bs4 import BeautifulSoup

   baseurl = 'https://www.tendercuts.in/'

   r = requests.get('https://www.tendercuts.in/chicken')
   soup = BeautifulSoup(r.content, 'lxml')

   productweight = soup.find_all('span', class_='callout')

   print(productweight)

Desired output:

   480 - 500 Gms
   ...
   and so on

Answer:

To only get the weight, you can use the following CSS selector:

import requests
from bs4 import BeautifulSoup

baseurl = 'https://www.tendercuts.in/'

r = requests.get('https://www.tendercuts.in/chicken')
soup = BeautifulSoup(r.content, 'lxml')

for tag in soup.select(".weight > span span:nth-of-type(2), .col-8 span:nth-of-type(2)"):
    print(tag.next_sibling)

Output:

Customizable
Customizable
Customizable
 480 - 500 Gms
 190 - 210 Gms
 480 - 500 Gms
 280 - 360 Gms
 ...
 ...

Or, to skip the "Customizable" entries and get only the fixed weights:

for tag in soup.select(".weight > span span:nth-of-type(2), .col-8 span:nth-of-type(2)"):
    if tag.next_sibling != "Customizable":
        print(tag.next_sibling)
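
The loop above relies on the weight being a bare text node that directly follows the second span. Here is a minimal offline sketch of that idea. Note that SAMPLE_HTML is made-up markup (the real page's structure is not shown in this thread), extract_weights is a hypothetical helper name, and only the first half of the selector is exercised. It also strips the leading whitespace visible in the output above and uses the built-in html.parser so that lxml is not needed.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup for illustration only; the live page may differ.
SAMPLE_HTML = """
<div class="weight">
  <span>
    <span>Net wt:</span><span class="icon"></span> 480 - 500 Gms
  </span>
</div>
<div class="weight">
  <span>
    <span>Net wt:</span><span class="icon"></span>Customizable
  </span>
</div>
"""


def extract_weights(html):
    """Return the text that follows each matched span, skipping 'Customizable'."""
    soup = BeautifulSoup(html, "html.parser")
    weights = []
    for tag in soup.select(".weight > span span:nth-of-type(2)"):
        sibling = tag.next_sibling
        # next_sibling can be None or another tag, so keep text nodes only.
        if isinstance(sibling, NavigableString):
            text = sibling.strip()  # drop surrounding whitespace and newlines
            if text and text != "Customizable":
                weights.append(text)
    return weights


weights = extract_weights(SAMPLE_HTML)
print(weights)
```

Comparing the stripped text rather than the raw sibling keeps the "Customizable" check working even if the page pads that label with spaces.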