I'm having trouble extracting a particular link from each of the web pages I'm considering.
Take, for example, the following pages:
I would like to know if there is a general way to extract the WEBSITE field (above the map) shown in the table on the left of each page. For the pages above, I would like to extract the links:
No tag or attribute uniquely identifies this field, which makes extraction difficult. I tried a solution that selects by class, but it doesn't seem to work. For the first link I have:
from bs4 import BeautifulSoup
import requests
url = "https://lefooding.com/en/restaurants/ezkia"
res = requests.get(url)
soup = BeautifulSoup(res.text, 'html.parser')
data = soup.find("div", {"class": "e-rowContent"})
print(data)
but there is no trace of the link I need here. Does anyone know of a possible solution?
CodePudding user response:
Try this:
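import requests
from bs4 import BeautifulSoup

urls = [
    "https://lefooding.com/en/restaurants/ezkia",
    "https://lefooding.com/en/restaurants/tekes",
]

# Reuse one session so the connection is shared across requests.
with requests.Session() as s:
    for url in urls:
        # Take the last <a> inside the info block; iterating over that tag
        # yields its children (here, the link text), which are then stripped.
        soup = [
            link.strip() for link
            in BeautifulSoup(
                s.get(url).text, "lxml"
            ).select(".pageGuide__infos a")[-1]
        ]
        print(soup)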
Output:
['https://www.ezkia-restaurant.fr']
['https://www.tekesrestaurant.com/']
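If it helps, here is a more explicit variant of the same idea, as a rough sketch rather than a definitive solution. It reads the `href` attribute of the last link inside `.pageGuide__infos` instead of iterating over the tag's text, and returns `None` when no link is found. The `extract_website` helper and the `SAMPLE_HTML` snippet are made up for illustration; the real page markup may differ, and this still assumes the website link is the last anchor in that block.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's info block; the real markup may differ.
SAMPLE_HTML = """
<div class="pageGuide__infos">
  <div class="e-rowContent"><a href="tel:+33100000000">01 00 00 00 00</a></div>
  <div class="e-rowContent"><a href="https://www.example.com/">https://www.example.com/</a></div>
</div>
"""


def extract_website(html):
    """Return the href of the last link inside .pageGuide__infos, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Only anchors that actually carry an href attribute are considered.
    links = soup.select(".pageGuide__infos a[href]")
    if not links:
        return None
    return links[-1]["href"].strip()


print(extract_website(SAMPLE_HTML))
```

For live pages you would pass `s.get(url).text` to the function inside the loop from the answer above. Checking for an empty result avoids the `IndexError` that `[-1]` raises when a page has no matching links.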