I'm trying to extract only the product weights from the following site. I'm almost there, but the extracted output also contains other things. How can I filter out only the weight of each product?
Source: https://www.tendercuts.in/chicken
Here is the code I used:
import requests
from bs4 import BeautifulSoup
baseurl = 'https://www.tendercuts.in/'
r = requests.get('https://www.tendercuts.in/chicken')
soup = BeautifulSoup(r.content, 'lxml')
productweight = soup.find_all('span', class_='callout')
print(productweight)
Desired output:
480 - 500 Gms
''''''''
''''''
''''''
so on ...
CodePudding user response:
To get only the weight, you can use the following CSS selector:
import requests
from bs4 import BeautifulSoup
baseurl = 'https://www.tendercuts.in/'
r = requests.get('https://www.tendercuts.in/chicken')
soup = BeautifulSoup(r.content, 'lxml')
for tag in soup.select(".weight > span span:nth-of-type(2), .col-8 span:nth-of-type(2)"):
    print(tag.next_sibling)
Output:
Customizable
Customizable
Customizable
480 - 500 Gms
190 - 210 Gms
480 - 500 Gms
280 - 360 Gms
...
...
Or, to get only the fixed weights and skip the "Customizable" entries:
for tag in soup.select(".weight > span span:nth-of-type(2), .col-8 span:nth-of-type(2)"):
    if tag.next_sibling != "Customizable":
        print(tag.next_sibling)
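If the extracted strings ever come back with surrounding whitespace or with other non-weight labels mixed in, an exact comparison against "Customizable" may not be enough. As a rough sketch of a more tolerant filter, the strings can be matched against a weight pattern instead. The helper name `only_weights`, the regular expression, and the unit list (`Gms`, `Kg`) are assumptions based on the sample output above, not something taken from the site itself:

```python
import re

# Assumed pattern: "<number> - <number> Gms" or a single "<number> Gms".
# The units are a guess based on the sample output; extend as needed.
WEIGHT_RE = re.compile(r"^\d+(?:\s*-\s*\d+)?\s*(?:Gms|Kg)$", re.IGNORECASE)

def only_weights(texts):
    """Return the stripped strings that look like a weight or weight range."""
    # Skip missing siblings (None) and trim whitespace before matching.
    cleaned = (str(t).strip() for t in texts if t is not None)
    return [t for t in cleaned if WEIGHT_RE.match(t)]

# Hypothetical sample standing in for the scraped values.
sample = ["Customizable", " 480 - 500 Gms ", None, "190 - 210 Gms", "Rs. 149"]
print(only_weights(sample))  # ['480 - 500 Gms', '190 - 210 Gms']
```

With the selector from the answer, this could be called as `only_weights(tag.next_sibling for tag in soup.select(...))`.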