I am trying to scrape a website to get the price and name of a product.
I don't know how to extract one specific script: the inline JSON script that holds the product details, <script type="application/ld+json">.
So I extracted all of the inline scripts with BeautifulSoup and assigned them to scripts. I tried many ways to pick out that one script, but none of them worked, so I indexed into the result like a list.
I chose index [6] to isolate the script I want and assigned it to a variable named product_script.
After that I used some techniques to split it up and extract the price and product name.
But I would like another way to extract the data from the inline JSON script.
This is my code:
def function_glomark_name(url_glomark):
    global product_name_glomark
    req2 = requests.get(url_glomark)
    # product_request is my own helper (not shown here); it parses the response into the global soup
    product_request(req2)
    head_part = soup.find('head')
    scripts = head_part.find_all('script')
    product_script = scripts[6]
    # Remove tags
    pd_list = product_script.contents
    for item in pd_list:
        product_des = item
    # Make dictionary
    product_glomark = json.loads(product_des)
    # Assign product_name_glomark
    product_name_glomark = product_glomark['name']
    print(product_name_glomark)
    return product_name_glomark
glomark_coconut = 'https://glomark.lk/coconut/p/11624'
# Calling the function
function_glomark_name(glomark_coconut)
function_laughs_name(laughs_coconut)
Output: Coconut
CodePudding user response:
To parse the contents of a specific <script> tag, you can select it by its type attribute, as in this example:
import json
import requests
from bs4 import BeautifulSoup

url = "https://glomark.lk/coconut/p/11624"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Select the JSON-LD script by its type attribute instead of by position
s = soup.select_one('script[type="application/ld+json"]')
data = json.loads(s.text)

for key, value in data.items():
    print(f"{key=} {value=}")

print("-" * 80)
print(f'Name is {data["name"]}')
Prints:
key='@context' value='https://schema.org'
key='@type' value='Product'
key='productID' value='11624'
key='name' value='Coconut'
key='description' value='Coconut'
key='url' value='/coconut/p/11624'
key='image' value='https://objectstorage.ap-mumbai-1.oraclecloud.com/n/softlogicbicloud/b/cdn/o/products/310310--01--1555692325.jpeg'
key='brand' value='GLOMARK'
key='offers' value=[{'@type': 'Offer', 'price': '92', 'priceCurrency': 'LKR', 'itemCondition': 'https://schema.org/NewCondition', 'availability': 'https://schema.org/InStock'}]
--------------------------------------------------------------------------------
Name is Coconut
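Since the question also asks for the price, here is a minimal sketch of one way to pull it from the same data. In the output above the price sits inside the "offers" list, so this reads the first offer. The function name extract_product and the SAMPLE_HTML string are hypothetical: the sample is a trimmed copy of the JSON shown above so that the snippet runs without a network request, and on the real page you would pass requests.get(url).content instead. It also assumes the page may contain more than one JSON-LD block and keeps only the one whose "@type" is "Product".

```python
import json

from bs4 import BeautifulSoup

# Hypothetical sample page, trimmed from the output shown above.
# On the real site, pass requests.get(url).content to extract_product instead.
SAMPLE_HTML = """
<html><head>
<script type="text/javascript">var unrelated = 1;</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "productID": "11624",
 "name": "Coconut", "brand": "GLOMARK",
 "offers": [{"@type": "Offer", "price": "92", "priceCurrency": "LKR"}]}
</script>
</head><body></body></html>
"""


def extract_product(html):
    """Return (name, price, currency) from the first Product JSON-LD block, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all("script", attrs={"type": "application/ld+json"}):
        try:
            data = json.loads(tag.string or "")
        except json.JSONDecodeError:
            continue  # skip blocks that are empty or not valid JSON
        if not isinstance(data, dict) or data.get("@type") != "Product":
            continue  # skip JSON-LD blocks that describe something else
        offers = data.get("offers") or []
        if isinstance(offers, dict):
            offers = [offers]  # tolerate a single offer given as an object
        offer = offers[0] if offers else {}
        return data.get("name"), offer.get("price"), offer.get("priceCurrency")
    return None


name, price, currency = extract_product(SAMPLE_HTML)
print(f"{name}: {price} {currency}")
```

Note that the price comes back as a string ('92' in the output above), so convert it with float() or decimal.Decimal() if you need to do arithmetic on it. Selecting by type and "@type" rather than by index means the code should keep working if the site adds or reorders scripts in the head.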