Python Webscrape (using BeautifulSoup) question

Time:11-28

I am trying to scrape https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park to get the Land Size (sqm) value from the overview table. The expected result is 40,608.

However, I am unable to get the result I want :( Here is my code:

# [Python] test web scrape on edgeprop
# (imports that this snippet never uses have been dropped)
import requests
from bs4 import BeautifulSoup


query_string='https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park'  
resp = requests.get(query_string)   
soup = BeautifulSoup(resp.content,'html.parser')
print("URL is: ", query_string)

# find_all returns a list of matching tags (empty if none match),
# so it never raises IndexError and needs no try/except here
landsize = soup.find_all("h4", class_="detail-title__text")
print("Landsize is: ", landsize)

As you can see, I am really new to this. I hope you can offer some guidance, please :)

CodePudding user response:

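Try this. The page carries its data as JSON inside the script tag with id __NEXT_DATA__, so you can read the value from there instead of searching the HTML tags: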

import json
import requests
from bs4 import BeautifulSoup

query_string = 'https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park'

resp = requests.get(query_string)
soup = BeautifulSoup(resp.content, 'html.parser')

# get the script tag whose content is a JSON string with all the page data
data = soup.find("script", id="__NEXT_DATA__").text

# convert the JSON string to a Python dict
json_data = json.loads(data)

# walk down the nested dict to land_size
print(json_data["props"]["pageProps"]["projectInfo"]["data"]["land_size"])
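
If you want to find where a field sits in that JSON yourself (for other fields, or if the site changes its layout), a small recursive search can help. This is only a sketch: find_key_paths is a helper name I made up, and the sample string below is a hypothetical, heavily trimmed stand-in for the real payload, not what the site actually returns. In practice you would pass in the json_data dict from the code above.

```python
import json


def find_key_paths(obj, target, path=()):
    """Yield (path, value) for every occurrence of key `target` in nested dicts/lists."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == target:
                yield path + (key,), value
            # keep descending, the key may also appear deeper down
            yield from find_key_paths(value, target, path + (key,))
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from find_key_paths(item, target, path + (index,))


# Hypothetical stand-in for the __NEXT_DATA__ payload (assumed shape and value format).
sample = json.loads(
    '{"props": {"pageProps": {"projectInfo": {"data": {"land_size": "40,608"}}}}}'
)

matches = list(find_key_paths(sample, "land_size"))
for path, value in matches:
    print(" -> ".join(map(str, path)), "=", value)
```

Each path it prints is the chain of keys (and list indexes) to use in square brackets, the same way the last line of the answer's code does.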