What's the error in this script?
from bs4 import BeautifulSoup
import requests
years = [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
web = 'https://www.uefa.com/uefaeuropaleague/history/seasons/2022/matches/'
response = requests.get(web)
content = response.text
soup = BeautifulSoup(content, 'lxml')
matches = soup.find_all('div', class_='pk-match-unit size-m')
for match in matches:
    print(match.find('div', class_='pk-match__base--team-home size-m').get_text())
    print(match.find('div', class_='pk-match__score size-m').get_text())
    print(match.find('div', class_='pk-match__base--team-away size-m').get_text())
I can't find the error. The print calls are meant to output the match data from the latest edition of the Europa League. I am attaching a picture of the HTML for reference, since I do not see where the error is. Keep in mind that I am only doing this for the year 2021.
I am trying to get the results from the group stage through to the final.
CodePudding user response:
First of all, always take a look at your soup to see if all the expected ingredients are there.
The issue here is that the data is loaded from an API, so it is not in the response that requests receives and BeautifulSoup cannot find it.
- Take a look at the XHR tab in your browser's dev tools and use that API call to get the results as JSON.
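One way to check the soup before writing any selectors is to count how many elements in the raw response carry the class you are looking for. The sketch below does that with the standard library's html.parser instead of BeautifulSoup, so it runs without extra packages. `ClassCounter`, `count_elements_with_class` and the `static_html` sample (including its `pk-container` class) are made up for illustration and are not the real UEFA page.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class token."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_elements_with_class(html, class_name):
    """Return how many elements in html carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Hypothetical stand-in for response.text: a page shell without match markup.
static_html = '<html><body><div class="pk-container"></div></body></html>'
print(count_elements_with_class(static_html, "pk-match-unit"))  # 0
```

With the real page you would pass `response.text` instead of `static_html`. A count of 0 for a class you can see in the browser suggests the markup is added later by JavaScript.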
Example
Inspect the whole JSON to pick the info that fits your needs.
import requests
json_data = requests.get('https://match.uefa.com/v5/matches?competitionId=14&seasonYear=2022&phase=TOURNAMENT&order=DESC&offset=0&limit=20').json()
for m in json_data:
    print(m['awayTeam']['internationalName'], f"{m['score']['total']['away']}:{m['score']['total']['home']}", m['homeTeam']['internationalName'])
Output
Rangers 1:1 Frankfurt
West Ham 0:1 Frankfurt
Leipzig 1:3 Rangers
Frankfurt 2:1 West Ham
Rangers 0:1 Leipzig
Braga 1:3 Rangers
West Ham 3:0 Lyon
Frankfurt 3:2 Barcelona
Leipzig 2:0 Atalanta
Rangers 0:1 Braga
...
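Since the URL already carries `offset` and `limit`, a rough way to go past the first 20 results is to keep raising `offset` until nothing comes back. This is only a sketch: `format_match`, `iter_matches` and `fetch_from_uefa` are hypothetical helper names, and it assumes the endpoint returns an empty list once `offset` runs past the last match, which is not verified here. The query parameters are copied from the URL above.

```python
def format_match(m):
    """Build one result line using the same keys as the print call above."""
    score = m["score"]["total"]
    return (f"{m['awayTeam']['internationalName']} "
            f"{score['away']}:{score['home']} "
            f"{m['homeTeam']['internationalName']}")


def iter_matches(fetch_page, limit=20):
    """Yield matches page by page until a page comes back empty.

    fetch_page(offset, limit) is a hypothetical callable returning a list.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        if not page:  # assumption: an empty page means no more matches
            break
        yield from page
        offset += limit


def fetch_from_uefa(offset, limit):
    """Fetch one page from the endpoint used above (needs network access)."""
    import requests  # imported here so the helpers above work without it
    params = {"competitionId": 14, "seasonYear": 2022, "phase": "TOURNAMENT",
              "order": "DESC", "offset": offset, "limit": limit}
    return requests.get("https://match.uefa.com/v5/matches", params=params).json()


# Offline demo with one hand-written record shaped like the JSON used above.
sample = [{"awayTeam": {"internationalName": "Rangers"},
           "homeTeam": {"internationalName": "Frankfurt"},
           "score": {"total": {"away": 1, "home": 1}}}]
for m in iter_matches(lambda offset, limit: sample[offset:offset + limit]):
    print(format_match(m))  # Rangers 1:1 Frankfurt
```

To run it against the live API, call `iter_matches(fetch_from_uefa)` in place of the lambda. Keeping the fetch function separate from the paging loop makes it easy to try the loop on saved JSON first.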