Unable to scrape all the listed results with beautifulsoup

Time:03-16

https://www.ejendomstorvet.dk/ledigelokaler/koebenhavn-by/detailhandel-butik

I tried to scrape this real estate listing page. It says there are 241 results, but no matter how many times I try, I can only scrape 12 of them.

from bs4 import BeautifulSoup
import requests
from csv import writer

url = "https://www.ejendomstorvet.dk/ledigelokaler/detailhandel-butik"
page = requests.get(url)

soup = BeautifulSoup(page.content, 'html.parser')
lists = soup.find_all('div', class_ ="propcontainer")

with open('ejendomstorvet_butik_03-140322.csv', 'w', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['title', 'address_01', 'address_02', 'size', 'price', 'link']
    thewriter.writerow(header)

    # Write one row per listing while the file is still open
    for e in lists:
        title = e.find('div', class_="prop__intro").text.replace('\r\n', '')
        address_01 = e.find('div', class_="prop__address").text.replace('\r\n', '')
        address_02 = e.find('div', class_="prop__address2").text.replace('\r\n', '')
        size = e.find('span', class_="prop__size").text.replace('\r\n', '')
        price = e.find('span', class_="prop__price").text.replace('\r\n', '')
        # Store the URL itself rather than the whole <a> tag
        link = e.find('a', href=True)['href']

        info = [title, address_01, address_02, size, price, link]
        thewriter.writerow(info)
    

Result looks like this

CodePudding user response:

The full result list is populated dynamically by JavaScript from the JSON response of an API call, so the HTML that requests downloads does not contain all the listings. Here is an example that reads the listings from that API instead.
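
As a side check before the script: the diagnosis can be confirmed by counting the listing containers in the raw HTML and comparing that with the total the page advertises. This is a minimal sketch on a made-up HTML sample. It uses the standard library's HTMLParser instead of BeautifulSoup only so that it runs without extra packages, and `ClassCounter`, `count_listings` and `sample_html` are illustrative names, not part of the site or of the script below.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of a given name whose class attribute contains a class."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.class_name in classes:
            self.count += 1


def count_listings(html, tag="div", class_name="propcontainer"):
    parser = ClassCounter(tag, class_name)
    parser.feed(html)
    return parser.count


# Made-up stand-in for the text that requests.get(url).text would return
sample_html = """
<p>241 results</p>
<div class="propcontainer"><div class="prop__intro">Shop A</div></div>
<div class="propcontainer featured"><div class="prop__intro">Shop B</div></div>
<div class="other">not a listing</div>
"""

found = count_listings(sample_html)
print(f"listings in static HTML: {found}")
```

If the count is far below the advertised total, as with 12 versus 241 in the question, the remaining items are most likely loaded later by JavaScript.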

Script:

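import requests
import pandas as pd

# Cookies copied from a browser session. The `search` cookie carries the search URL
# (/ledigelokaler/koebenhavn-by/detailhandel-butik); the values are session-specific.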
cookies={'Cookie': 'ASP.NET_SessionId=unma1hbvnwdd53w0iqfh5mfl; search=id=3ee38b0e-551b-4190-a617-cf0dd020d99a&itemtype=OwnUse&url=/ledigelokaler/koebenhavn-by/detailhandel-butik&convertedsearch=0; usercookie=id=f3c5912a-aa72-4a78-bff0-826eaab28072&c=MDMvMTQvMjAyMiAxOTo1NDoxNw==&data=JGYzYzU5MTJhLWFhNzItNGE3OC1iZmYwLTgyNmVhYWIyODA3MgCAo5vl0gXaiAEBJGYzYzU5MTJhLWFhNzItNGE3OC1iZmYwLTgyNmVhYWIyODA3Mglhbm9ueW1vdXMA; settings=usersettings=AAEAAAAAAAZsYXRlc3QABmNsb3NlZAA=; prism_610987956=e5c78891-283e-43ad-a8d9-43db0a7dd452; _clck=1xe2hka|1|ezr|0; CookieInformationConsent={"website_uuid":"81ea14e2-c192-405e-8523-46a926c64030","timestamp":"2022-03-14T15:55:15.512Z","consent_url":"https://www.ejendomstorvet.dk/ledigelokaler/koebenhavn-by/detailhandel-butik","consent_website":"Ejendomstorvet.dk","consent_domain":"www.ejendomstorvet.dk","user_uid":"7af36c12-088f-441a-a76a-1a52df8f229a","consents_approved":["cookie_cat_necessary","cookie_cat_functional","cookie_cat_statistic","cookie_cat_marketing","cookie_cat_unclassified"],"consents_denied":[],"user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36"}; _gcl_au=1.1.273465374.1647273316; _gid=GA1.2.924375637.1647273316; _ga_0Q5HR8S1F9=GS1.1.1647273305.1.1.1647273357.18; _ga=GA1.2.1641080384.1647273306; _uetsid=073761a0a3af11ecacdf8b5661088ce4; _uetvid=07379f70a3af11ecbff1db1567aff869; _clsk=nd9336|1647273413426|2|1|d.clarity.ms/collect'}
# Mark the request as an AJAX call, as the browser does
headers = {'X-Requested-With': 'XMLHttpRequest'}

api_url = "https://www.ejendomstorvet.dk/search/result?gethighlighted=false&imagewidth=620&imageheight=400"

jsonData = requests.get(api_url, headers=headers, cookies=cookies).json()

data = []
# NOTE: assigning to jsonData['NumberOfPages'] only changes the local dict; it does
# not send a new request, so each pass iterates the same PropertyResultList.
for page in range(1, 17):
    jsonData['NumberOfPages'] = page
    for item in jsonData['PropertyResultList']:
        title = item['Flashline']
        url = item['RefUrl']
        address = item['Address']
        city = item['City']
        data.append([title, url, address, city])

cols = ['title', 'url', 'address', 'city']

df = pd.DataFrame(data, columns=cols)
print(df)
# df.to_csv('output.csv', index=False)  # uncomment to store the data

Output:

                                     title  ...             city
0                Nyopført ejendom ved Ny Ellebjerg st.  ...     København SV
1                Nyopført ejendom ved Ny Ellebjerg st.  ...     København SV
2                Nyopført ejendom ved Ny Ellebjerg St.  ...     København SV
3                Nyopført ejendom ved Ny Ellebjerg St.  ...     København SV
4                Nyopført ejendom ved Ny Ellebjerg St.  ...     København SV
..                                                 ...  ...              ...
187                     Strandgade 7, 1401 København K  ...      København K
188  Centralt beliggende erhvervslokale på Frederik...  ...  Frederiksberg C
189                                   Charmerende café  ...      København Ø
190  Velbeliggende mindre butik/café centralt i Øre...  ...      København S
191        Produktionskøkken til leje på Frederiksberg  ...    Frederiksberg

[192 rows x 4 columns]
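
The 192 rows above equal 16 × 12, so it is worth checking whether they come from 16 distinct pages or from the same 12 listings repeated. Below is a minimal sketch of a paging loop that issues one request per page and drops repeats by `RefUrl`. The names `collect_all`, `fetch_page`, `fake_fetch` and the sample data are hypothetical. How the real endpoint selects a page (query parameter, cookie or request body) is not shown in this thread and would have to be read from the browser's network tab.

```python
def collect_all(fetch_page, max_pages=50):
    """Request pages 1, 2, ... until an empty page, keeping each RefUrl once."""
    seen = set()
    rows = []
    for page in range(1, max_pages + 1):
        items = fetch_page(page)  # one request per page
        if not items:
            break  # no more results
        for item in items:
            key = item["RefUrl"]
            if key in seen:
                continue  # skip listings already collected
            seen.add(key)
            rows.append([item["Flashline"], key, item["Address"], item["City"]])
    return rows


# Fake in-memory "API" standing in for real HTTP requests; the data is made up.
FAKE_PAGES = {
    1: [
        {"Flashline": "Shop A", "RefUrl": "/a", "Address": "Street 1", "City": "X"},
        {"Flashline": "Shop B", "RefUrl": "/b", "Address": "Street 2", "City": "Y"},
    ],
    2: [
        {"Flashline": "Shop B", "RefUrl": "/b", "Address": "Street 2", "City": "Y"},
        {"Flashline": "Shop C", "RefUrl": "/c", "Address": "Street 3", "City": "Z"},
    ],
}


def fake_fetch(page):
    # Pages that do not exist return an empty list, which ends the loop
    return FAKE_PAGES.get(page, [])


rows = collect_all(fake_fetch)
print(len(rows))
```

In real use, `fetch_page` would wrap a `requests.get(...)` call and return the `PropertyResultList` of the JSON response for the requested page.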