Scraping a specific table from a website with BeautifulSoup

Time:12-03

I want to scrape one specific table, titled "Form table (last 8)", from this page: https://www.soccerstats.com/pmatch.asp?league=italy&stats=145-7-5-2022 but my code fails with AttributeError: 'NoneType' object has no attribute 'text'

Code

  import requests
  from bs4 import BeautifulSoup

  link = 'https://www.soccerstats.com/pmatch.asp?league=italy&stats=145-7-5-2022'
  headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36'}
  s = requests.Session()
  s.headers.update(headers)

  response = s.get(link)
  soup = BeautifulSoup(response.text, 'html.parser')

  standings_forms = soup.find_all('table', border='0', cellspacing='0', cellpadding='0', width='100%')
  for t in standings_forms:
    # AttributeError is raised on the next line
    if t.find('b').text == 'Form table (last 8)':
      print(t)

Answer:

Try the following script to pull the data out of that particular table. It relies on CSS pseudo-class selectors, which BeautifulSoup supports only from version 4.7.0 onward, so upgrade first if your installation is older: pip install --upgrade beautifulsoup4
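Before running the script, one way to confirm that the installed version is recent enough is a quick check like the sketch below. The helper name version_tuple is made up for illustration, and the check only compares the first three numeric parts of the version string.

```python
import re

import bs4


def version_tuple(version):
    """Turn a version string such as '4.12.3' into (4, 12, 3) for comparison."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits of each part
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


print("beautifulsoup4:", bs4.__version__)

try:
    # Soup Sieve is the selector engine bs4 uses from 4.7.0 onward.
    import soupsieve
    print("soupsieve:", soupsieve.__version__)
except ImportError:
    print("soupsieve is not installed")

# select() with pseudo-classes such as :has() needs bs4 4.7.0 or newer.
supports_pseudo_classes = version_tuple(bs4.__version__) >= (4, 7, 0)
print("CSS pseudo-classes available:", supports_pseudo_classes)
```

If the last line prints False, upgrading beautifulsoup4 should bring in a compatible Soup Sieve as a dependency.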

import requests
from bs4 import BeautifulSoup

link = 'https://www.soccerstats.com/pmatch.asp?league=italy&stats=145-7-5-2022'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    res = s.get(link)
    soup = BeautifulSoup(res.text, "html.parser")
    # Find the outer table whose heading contains 'Form table', then walk the rows
    # of the table nested inside it, skipping the first row.
    # Note: recent Soup Sieve releases deprecate :contains() in favour of :-soup-contains().
    for item in soup.select("table:has(> tr > td > b:contains('Form table')) table > tr")[1:]:
        cells = item.select("td")
        name = cells[0].get_text(strip=True)
        gp = cells[1].get_text(strip=True)
        pts = cells[2].get_text(strip=True)
        print((name, gp, pts))

The above script generates the following output:

('Atalanta', '8', '20')
('Inter Milan', '8', '17')
('AC Milan', '8', '16')
('Napoli', '8', '15')
('Juventus', '8', '13')
('Bologna', '8', '13')
('Fiorentina', '8', '12')
('Sassuolo', '8', '12')
('Hellas Verona', '8', '12')
('AS Roma', '8', '10')
('Empoli', '8', '10')
('Lazio', '8', '10')
('Venezia', '8', '10')
('Torino', '8', '9')
('Sampdoria', '8', '9')
('Udinese', '8', '8')
('Spezia', '8', '7')
('Cagliari', '8', '6')
('Genoa', '8', '5')
('Salernitana', '8', '4')
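
As for the AttributeError in the question: find() returns None when a tag has no matching descendant, so any table matched by find_all() that contains no <b> tag makes t.find('b').text fail. If you would rather keep the find_all() approach, a guarded loop along these lines should avoid the error. This is only a sketch, and the HTML below is a made-up fragment for illustration, not the real page markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: the first table has no <b> tag.
html = """
<table width="100%"><tr><td>No bold heading here</td></tr></table>
<table width="100%">
  <tr><td><b>Form table (last 8)</b></td></tr>
  <tr><td>Atalanta</td><td>8</td><td>20</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

matches = []
for t in soup.find_all("table", width="100%"):
    heading = t.find("b")  # None when the table contains no <b>
    # Check for None before reading the text, so tables without a heading are skipped.
    if heading is not None and heading.get_text(strip=True) == "Form table (last 8)":
        matches.append(t)

print(len(matches))  # prints 1
```

Only the second table ends up in matches; the first one is skipped instead of raising.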