I'm trying to get the city's email address from http://www.comuni-italiani.it/110/index.html
Using xPath Finder, I found the path to the element: /html/body/span[3]/table[2]/tbody/tr[1]/td[2]/table/tbody/tr[11]/td/b/a
Now I'm trying to retrieve the email from this page, but I know very little about the BeautifulSoup
library (I'm just getting started). After reading several guides I managed to write the following code, but I can't work out how to specify the path to the child element correctly:
from bs4 import BeautifulSoup
import requests
# sample web page
sample_web_page = 'http://www.comuni-italiani.it/110/index.html'
# call get method to request that page
page = requests.get(sample_web_page)
# with the help of beautifulSoup and html parser create soup
soup = BeautifulSoup(page.content, "html.parser")
child_soup = soup.find('span')
for i in child_soup.children:
    print("child : ", i)
What am I doing wrong?
CodePudding user response:
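Here is my attempt at solving your problem. It starts the same way as your code. The difference is that soup.find('span') returns only the first <span> on the page, whereas your XPath points under the third one. Also note that browsers insert <tbody> when they build the DOM, so a path copied from a browser tool may not match the raw HTML that requests downloads. Rather than walking the path step by step, the code below uses a CSS selector to pick the first <a> that is a direct child of a <b> and whose href starts with "mail" (a mailto: link), then strips the "mailto:" prefix.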
from bs4 import BeautifulSoup
import requests
sample_web_page = 'http://www.comuni-italiani.it/110/index.html'
page = requests.get(sample_web_page)
soup = BeautifulSoup(page.content, "html.parser")
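# first <a> directly inside a <b> whose href starts with "mail" (a mailto: link)
email = soup.select_one('b > a[href^="mail"]')['href']
# drop the "mailto:" scheme and print only the address
print(email.split(':')[1])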
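If the page has no matching link, select_one returns None and the ['href'] lookup above raises a TypeError. Below is a minimal sketch of a more defensive version. It runs against a small inline HTML string, so the markup, the SAMPLE_HTML name and the extract_email helper are all made up for illustration and are not the real structure of the site.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (an assumption,
# not the site's actual HTML).
SAMPLE_HTML = """
<html><body>
<span>first span, no links</span>
<span>
  <b><a href="http://example.com">Website</a></b>
  <b><a href="mailto:info@example.com?subject=Hello">info@example.com</a></b>
</span>
</body></html>
"""


def extract_email(html):
    """Return the address of the first mailto: link inside a <b>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one('b > a[href^="mailto:"]')
    if link is None:
        # No matching link on the page.
        return None
    # Remove the "mailto:" scheme and any "?subject=..." query part.
    return link["href"][len("mailto:"):].split("?", 1)[0]


email = extract_email(SAMPLE_HTML)
no_email = extract_email("<p>no links here</p>")
print(email)
```

To use it on the real page, pass page.content from the requests call instead of SAMPLE_HTML and check the result for None before using it.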