Extracting a var from a <script> tag in HTML

Time:11-30

I am trying to scrape product reviews from a web page, but I'm not sure how to extract a JavaScript variable that is defined inside a <script> tag.

Here's my Python code:

import requests
from bs4 import BeautifulSoup
import csv

# newline="" stops the csv module from writing blank lines between rows on Windows
a_file = open("ProductReviews.csv", "a", newline="")
writer = csv.writer(a_file)

# Write the titles of the columns to the CSV file
writer.writerow(["created_at", "reviewer_name", "rating", "content", "source"])

url = 'https://www.lazada.com.my/products/iron-gym-total-upper-body-workout-bar-i467342383.html'

# Connect to the URL
response = requests.get(url)

# Parse HTML and save to BeautifulSoup object
soup = BeautifulSoup(response.content, "html.parser")

# find_all is the current name; findAll is a legacy alias
data = soup.find_all('script')[123]

if 'var __moduleData__' in data.string:
    print("Yes")
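
One possible way to get at the variable without relying on a fixed index like [123] is to search the text for the `var __moduleData__ =` assignment and let `json.JSONDecoder.raw_decode` parse the object that follows. This is only a sketch: it assumes the object on the real page is valid JSON (no trailing commas or unquoted keys), and the `extract_js_object` helper and the `sample_html` string are made up for illustration, not taken from the page.

```python
import json
import re


def extract_js_object(text, var_name="__moduleData__"):
    """Return the JSON value assigned to `var <var_name> = ...` in text, or None."""
    match = re.search(r"var\s+" + re.escape(var_name) + r"\s*=\s*", text)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever JavaScript follows it (e.g. the closing semicolon).
    # It raises json.JSONDecodeError if the value is not valid JSON.
    obj, _end = json.JSONDecoder().raw_decode(text, match.end())
    return obj


# Hypothetical, trimmed stand-in for response.text (valid JSON, no trailing commas).
sample_html = """
<script>
    var __moduleData__ = {"data": {"root": {"fields": {"review": {"reviews": [
        {"rating": 5, "reviewContent": "tq barang dah sampai",
         "reviewTime": "24 May 2021", "reviewer": "Jaharinbaharin"}
    ]}}}}};
</script>
"""

module_data = extract_js_object(sample_html)
reviews = module_data["data"]["root"]["fields"]["review"]["reviews"]
for review in reviews:
    print(review["reviewTime"], review["reviewer"], review["rating"], review["reviewContent"])
```

With the code above, the same helper could be called as `extract_js_object(response.text)`, or on the `.string` of each script tag returned by BeautifulSoup. Each review could then be passed to `writer.writerow(...)`.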

Here's the page source (I removed the unnecessary code):

<html>
<head>
    <title></title>
</head>
<body>

    <script>
        var __moduleData__ = {
        "data": {
            "root": {
                "fields": {
                    "review": {
                        "reviews": [{
                            "rating": 5,
                            "reviewContent": "tq barang dah sampai",
                            "reviewTime": "24 May 2021",
                            "reviewer": "Jaharinbaharin",

                        }, {
                            "rating": 5,
                            "reviewContent": "Beautiful quality           