I have a page from which I want to extract an EAN number stored in a script tag (here it is 8806090571589).
I first tried to get the script with:
jsonn = r.html.find('script')[3].text
print(title, price, jsonn)
However, that didn't work.
The source code of the page is too long to post, but it can be viewed here:
view-source:https://www.kaufland.de/product/361834606/?search_value=waschmaschine
CodePudding user response:
In BeautifulSoup, find() returns only the first occurrence of a tag. Since you need the 4th occurrence, use findAll() instead (spelled find_all() in current versions). It returns a list of all occurrences, so you can index into it to pick the one you need.
I tried the following code on my machine:
import urllib3
from bs4 import BeautifulSoup

URL = "https://www.kaufland.de/product/361834606/?search_value=waschmaschine"
# Fetch the page, sending an explicit User-Agent header
response = urllib3.PoolManager().request("GET", URL, headers={'User-Agent' : "python"})
soup = BeautifulSoup(response.data.decode('utf-8'), 'html.parser')
# findAll() returns every <script> tag; index 3 is the fourth one
print(soup.findAll("script")[3])
You can use this code as a reference and modify it to suit your needs.
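Once you have the script's contents as a string, the EAN still has to be pulled out of it. Here is a minimal sketch using only the standard library. It assumes the number is stored under an "ean" key in JSON-like text, which I have not confirmed for this page. The SAMPLE_SCRIPT string and the extract_ean() helper are made up for illustration, so adjust the pattern to whatever the real script contains.

```python
import re

# Hypothetical script content; the real page's layout may differ.
SAMPLE_SCRIPT = (
    '<script>{"product": {"title": "Waschmaschine", '
    '"ean": "8806090571589"}}</script>'
)


def extract_ean(script_text):
    """Return the first 13-digit value found under an "ean" key, or None."""
    # Accept both quoted ("ean": "123...") and unquoted ("ean": 123...) values.
    match = re.search(r'"ean"\s*:\s*"?(\d{13})"?', script_text)
    return match.group(1) if match else None


print(extract_ean(SAMPLE_SCRIPT))
```

With BeautifulSoup you could pass str(tag) for each script tag to extract_ean() and keep the first result that is not None, which avoids depending on the script being at a fixed index.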