I want to scrape product prices and names from a website using Python and only beautif

Time:05-24

I tried web scraping like this. I want to get the price and name of a product from the website.

I don't know how to extract the specific inline JSON script that contains the product details: <script type="application/ld+json">.

So I extracted all the inline scripts using BeautifulSoup and assigned them to a variable named scripts. I tried many ways to extract that one specific script, but none of them worked, so I tried slicing the result like a list.

I used indexing to isolate the script I want: I chose index 6 (scripts[6]) and assigned it to a variable named product_script.

After that, I used some techniques to split and extract the price and product name.

But I want another way to extract data from the inline JSON script.

This is my code:

import json
import requests

def function_glomark_name(url_glomark):

    global product_name_glomark

    req2 = requests.get(url_glomark)

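    # product_request is a helper that is not shown here; it is assumed
    # to parse the response and set the global soup used below
    product_request(req2)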
    head_part = soup.find('head')
    scripts = head_part.find_all('script')

    product_script = scripts[6]
    
    #Remove tags    
    pd_list = product_script.contents
    for item in pd_list:
        product_des = item

    # make Dictionary
    product_glomark= json.loads(product_des)

    #Assign product_name_glomark

    product_name_glomark = (product_glomark['name'])
    print(product_name_glomark)
    return product_name_glomark

glomark_coconut = 'https://glomark.lk/coconut/p/11624'

#after calling function

function_glomark_name(glomark_coconut)

function_laughs_name(laughs_coconut)

Output: Coconut

CodePudding user response:

To parse the contents of a specific <script>, you can select it by its type attribute instead of by index, as in this example:

import json
import requests
from bs4 import BeautifulSoup


url = "https://glomark.lk/coconut/p/11624"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

s = soup.select_one('script[type="application/ld+json"]')
data = json.loads(s.text)

for key, value in data.items():
    print(f"{key=} {value=}")

print("-" * 80)
print(f'Name is {data["name"]}')

Prints:

key='@context' value='https://schema.org'
key='@type' value='Product'
key='productID' value='11624'
key='name' value='Coconut'
key='description' value='Coconut'
key='url' value='/coconut/p/11624'
key='image' value='https://objectstorage.ap-mumbai-1.oraclecloud.com/n/softlogicbicloud/b/cdn/o/products/310310--01--1555692325.jpeg'
key='brand' value='GLOMARK'
key='offers' value=[{'@type': 'Offer', 'price': '92', 'priceCurrency': 'LKR', 'itemCondition': 'https://schema.org/NewCondition', 'availability': 'https://schema.org/InStock'}]
--------------------------------------------------------------------------------
Name is Coconut
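
The question also asks for the price, which sits inside the offers list shown in the output above. The following is a minimal sketch of one way to get both values, not a definitive implementation. It assumes the page may contain more than one application/ld+json script, so it keeps the first one whose @type is Product. The function name extract_product and the SAMPLE_HTML string are hypothetical: the sample is a trimmed copy of the data printed above, plus an invented WebSite block to show the filtering, so the example runs without a network request.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical sample page: the Product block is trimmed from the output
# printed above; the WebSite block is invented to demonstrate filtering.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "WebSite", "name": "Example"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Coconut",
 "offers": [{"@type": "Offer", "price": "92", "priceCurrency": "LKR"}]}
</script>
</head><body></body></html>
"""


def extract_product(html):
    """Return name, price and currency of the first JSON-LD Product, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(tag.string or "")
        except json.JSONDecodeError:
            continue  # skip empty or malformed script blocks
        if isinstance(data, dict) and data.get("@type") == "Product":
            offers = data.get("offers") or []
            if isinstance(offers, dict):
                # "offers" may be a single object instead of a list
                offers = [offers]
            first_offer = offers[0] if offers else {}
            return {
                "name": data.get("name"),
                "price": first_offer.get("price"),
                "currency": first_offer.get("priceCurrency"),
            }
    return None


product = extract_product(SAMPLE_HTML)
print(product)
```

For a live page, the same function could be called as extract_product(requests.get(url).content). Note that the price comes back as a string ('92' in the data above), so convert it with float() or decimal.Decimal before doing arithmetic.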