Extracting text from a div inside a div without any attributes, for all the divs

Time:01-22

I am using this website: https://www.jurongpoint.com.sg/store-directory/. This is the area I am trying to scrape:

I am trying to scrape each store name together with its description.

This is my code:

import requests
from bs4 import BeautifulSoup
url="https://www.jurongpoint.com.sg/store-directory/?level=&cate=Food & Beverage&page=1"
data=requests.get(url)
soup=BeautifulSoup(data.content,"html.parser")
shops=soup.find_all(class_="table table-hover")

for k in shops:
    name=[a.text for a in k.find_all("td")[0:2]]
    desc=k.find_all('div',class_='col-9')
    for q in desc:
        print(" ".join(name))
        print(q.text)

This is my output:

https://i.stack.imgur.com/BWNxL.png

As you can see above, all the descriptions are correct, but the shop name is wrong. I'd appreciate any help, thank you! :)

CodePudding user response:

The variable name is built once per table, outside the inner loop over the descriptions, so it is always ['\n','\n','4 Fingers Crispy Chicken'].

I would suggest changing your code slightly to this:

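# Descriptions and store rows, both collected in page order
shops = soup.find_all('div', class_="col-9")
names = soup.find_all('tr', class_="clickable")

# Pair each store row with its description
for n, k in zip(names, shops):
    name = n.find_all('td')[1].text  # the store name is in the second cell
    desc = k.text
    print(name + "\n")
    print(desc)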

This lets you iterate over the two lists simultaneously with the zip function. The find_all('td')[1] relies on the fact that the store name always sits in the second cell of the tr tag with class clickable. For the div tag with class col-9, you just need to take its text.
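
To try the pairing without hitting the site, here is a minimal sketch that runs against an inline snippet. The markup in SAMPLE_HTML, the shop names and descriptions, and the helper name extract_shops are all made up for illustration; only the tr.clickable and div.col-9 selectors come from the answer above, so check the real page in your browser's inspector before relying on it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the structure described above:
# a "clickable" row whose second cell is the store name, followed by
# a div with class "col-9" that holds the description.
SAMPLE_HTML = """
<table class="table table-hover">
  <tr class="clickable"><td>#01-01</td><td>Shop A</td></tr>
  <tr><td colspan="2"><div class="row"><div class="col-9">Fried chicken.</div></div></td></tr>
  <tr class="clickable"><td>#01-02</td><td>Shop B</td></tr>
  <tr><td colspan="2"><div class="row"><div class="col-9">Coffee and toast.</div></div></td></tr>
</table>
"""


def extract_shops(html):
    """Return a list of (name, description) tuples, one per store row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find_all("tr", class_="clickable")
    descriptions = soup.find_all("div", class_="col-9")

    shops = []
    # zip pairs the i-th row with the i-th description
    for row, desc in zip(rows, descriptions):
        # The store name is assumed to be in the second cell of the row
        name = row.find_all("td")[1].get_text(strip=True)
        shops.append((name, desc.get_text(strip=True)))
    return shops


for name, desc in extract_shops(SAMPLE_HTML):
    print(name)
    print(desc)
```

Keep in mind that zip stops at the shorter of the two lists, so if a store has no col-9 div the later names and descriptions would shift out of step. Comparing len(rows) with len(descriptions) before pairing is a cheap sanity check.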
