I am trying to get the list of restaurants from this webpage, along with each restaurant's type and address. So far I have written this:
import requests
from bs4 import BeautifulSoup

url = 'https://www.takeout2unow.com/restaurants?collapsable=1'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/75.0.3770.80 Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
results = []
for restaurant in soup.select('.dd_bs a.dd_restlink'):
    results.append({
        'title': restaurant.find('div', {'class': 'menu__vendor-name ng-binding'}).text,
        'address': restaurant.find('div', {'class': 'vendor_details_item vendor_details_address'}).text,
        'details': restaurant.find('div', {'class': 'vendor_details_item ng-binding'}).text,
        'type': restaurant.find('div', {'class': 'servesCuisine'}).text
    })
results
Output
[]
I do not know whether I am selecting the wrong element or why I cannot get any data.
CodePudding user response:
This is simply because the data you are trying to scrape is not present when the page is first loaded. The GET request you sent returns only a shell of the page, containing its general layout.
The data is fetched afterwards by a POST request to "https://www.takeout2unow.com/left.json.xsl", which returns JSON with all the data. NB: You can also use a GET request with that link.
When writing scrapers, it is important to get the basics right first: confirm that the response actually contains the data before trying to parse it.
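One quick way to run that check is to look for the class names you plan to select in the raw response text before handing it to a parser. The sketch below is only an illustration: `missing_markers` is a hypothetical helper, and the `shell` string is an invented stand-in for `response.text`, not the site's real HTML.

```python
def missing_markers(html, markers):
    """Return the markers (e.g. class names) that never appear in the raw HTML."""
    return [marker for marker in markers if marker not in html]


# Invented stand-in for response.text: a page shell with no vendor entries yet.
shell = '<html><body><div class="dd_bs" ng-repeat="v in vendors"></div></body></html>'

# Class names taken from the selectors used in the question.
absent = missing_markers(shell, ["dd_bs", "dd_restlink", "menu__vendor-name"])
print(absent)  # ['dd_restlink', 'menu__vendor-name']
```

If the class names you rely on come back as missing from the real `response.text`, the selectors are probably fine and the content is being loaded after the initial request.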
CodePudding user response:
Actually, there is nothing wrong with your code, but the data is loaded dynamically by JavaScript from an API call that returns a JSON response via the POST method.
import requests

api_url = "https://www.takeout2unow.com/left.json.xsl"
jsonData = requests.post(api_url).json()
result = []
for item in jsonData['vendors']['vendor']:
    result.append({
        'title': item['sortName'],
        'address': item['vendoraddress'],
        'details': item['vendordescription'],
        'vendorType': item['cuisinetype']
    })
print(result)
Output:
[{'title': 'Cleaning 2 You', 'address': '3372 US 79 North', 'details': 'Cleaning 2 You is a locally owed and family operated cleaning service here to handle your residential and commercial cleaning needs! Call or message us today to set up a free quote! Licensed, bonded and insured.', 'vendorType': 'Retail'}, {'title': 'Kroger Paris', 'address': '1059 Mineral Wells Ave', 'details': "The Kroger Company, or simply Kroger, is an American retail company. It is the United States' largest supermarket by revenue, and the secondlargest general retailer.", 'vendorType': 'Groceries'}, {'title': 'Walmart Camden', 'address': '2200 US
... so on
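If some vendors turn out to lack one of these keys, indexing with `item['...']` would raise a `KeyError`. Below is a minimal sketch of a more tolerant version. It assumes the payload has the `vendors` -> `vendor` shape used in the code above; `extract_vendors` is a hypothetical helper, and `sample` is hand-written test data (with a description left out on purpose), not a real response from the site.

```python
def extract_vendors(payload):
    """Flatten the vendor records of a decoded JSON payload into plain dicts."""
    # Assumed shape: {"vendors": {"vendor": [ {...}, {...} ]}}
    vendors = payload.get("vendors", {}).get("vendor", [])
    rows = []
    for item in vendors:
        rows.append({
            "title": item.get("sortName", ""),
            "address": item.get("vendoraddress", ""),
            "details": item.get("vendordescription", ""),
            "vendorType": item.get("cuisinetype", ""),
        })
    return rows


# Hand-written sample in the assumed shape; the second record has no description.
sample = {"vendors": {"vendor": [
    {"sortName": "Cleaning 2 You", "vendoraddress": "3372 US 79 North",
     "vendordescription": "...", "cuisinetype": "Retail"},
    {"sortName": "Kroger Paris", "vendoraddress": "1059 Mineral Wells Ave",
     "cuisinetype": "Groceries"},
]}}

rows = extract_vendors(sample)
print(rows[1]["details"] == "")  # True: the missing key falls back to an empty string
```

With the live endpoint, the same function could be called as `extract_vendors(requests.post(api_url).json())`.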