I am using this website (https://www.jurongpoint.com.sg/store-directory/) and I am trying to scrape each store name together with its respective description.
This is my code:
import requests
from bs4 import BeautifulSoup
url="https://www.jurongpoint.com.sg/store-directory/?level=&cate=Food & Beverage&page=1"
data=requests.get(url)
soup=BeautifulSoup(data.content,"html.parser")
shops=soup.find_all(class_="table table-hover")
for k in shops:
    name=[a.text for a in k.find_all("td")[0:2]]
    desc=k.find_all('div',class_='col-9')
    for q in desc:
        print(" ".join(name))
        print(q.text)
This is my output:
https://i.stack.imgur.com/BWNxL.png
As you can see above, all the descriptions are correct, but the shop name is wrong. I would appreciate any help, thank you! :)
CodePudding user response:
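When you create the variable name, it is always ['\n','\n','4 Fingers Crispy Chicken'], because it is built once from the first td cells of the whole table instead of once per shop, so the same value is printed for every description.
I would suggest changing your code slightly to this: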
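# one div.col-9 per shop description
shops=soup.find_all('div',class_="col-9")
# one tr.clickable per shop row
names=soup.find_all('tr',class_="clickable")
for n, k in zip(names, shops):
    # the store name is in the second cell of the row
    name = n.find_all('td')[1].text
    desc = k.text
    print(name, "\n")
    print(desc)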
This way, the zip function lets you iterate over the two lists simultaneously, pairing each row with the description at the same position. The find_all('td')[1] lookup relies on the store name always being in the second cell of each tr tag with class clickable. For the description, you only need the text of each div tag with class col-9.
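To try the zip pairing without hitting the site, here is a minimal sketch that runs on an inline HTML fragment. The SAMPLE_HTML markup, the shop names, and the pair_shops helper are made up for illustration and only imitate the structure described above (a tr with class clickable whose second td holds the name, and a div with class col-9 holding the description); the real page may be laid out differently. Since zip pairs items purely by position, the sketch also checks that both lists have the same length.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that only imitates the structure described above;
# it is not copied from the real page.
SAMPLE_HTML = """
<table class="table table-hover">
  <tr class="clickable"><td>#01-01</td><td>Shop A</td></tr>
  <tr><td colspan="2"><div class="col-9">Description A</div></td></tr>
  <tr class="clickable"><td>#01-02</td><td>Shop B</td></tr>
  <tr><td colspan="2"><div class="col-9">Description B</div></td></tr>
</table>
"""


def pair_shops(html):
    """Return (name, description) tuples, paired by position (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find_all("tr", class_="clickable")
    descs = soup.find_all("div", class_="col-9")
    # zip silently stops at the shorter list, so fail loudly on a mismatch.
    if len(rows) != len(descs):
        raise ValueError("number of shop rows and descriptions differ")
    return [
        # The second cell of each clickable row is assumed to hold the name.
        (row.find_all("td")[1].get_text(strip=True), desc.get_text(strip=True))
        for row, desc in zip(rows, descs)
    ]


pairs = pair_shops(SAMPLE_HTML)
for name, desc in pairs:
    print(name)
    print(desc)
```

Using get_text(strip=True) instead of .text drops the surrounding newlines, which keeps the printed names clean.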