Why can't I scrape the new Euroleague.net website with BeautifulSoup?

Time:02-11

I want to scrape the links to all games in a specific week. I can see them in the browser's inspector, but my scraper only returns the links to the upcoming games, no matter which page (game week) I request. https://www.euroleaguebasketball.net/euroleague/game-center/?round=1&season=E2021

soup.find_all('a', class_="game-card-view_linkWrap__u3Tea")

shows:

['/euroleague/game-center/2021-22/olympiacos-piraeus-anadolu-efes-istanbul/E2021/228/',
 '/euroleague/game-center/2021-22/alba-berlin-zenit-st-petersburg/E2021/227/',
 '/euroleague/game-center/2021-22/as-monaco-zalgiris-kaunas/E2021/226/',
 '/euroleague/game-center/2021-22/maccabi-playtika-tel-aviv-cska-moscow/E2021/229/',
 '/euroleague/game-center/2021-22/ax-armani-exchange-milan-bitci-baskonia-vitoria-gasteiz/E2021/230/',
 '/euroleague/game-center/2021-22/unics-kazan-crvena-zvezda-mts-belgrade/E2021/231/',
 '/euroleague/game-center/2021-22/fenerbahce-beko-istanbul-fc-bayern-munich/E2021/232/',
 '/euroleague/game-center/2021-22/ldlc-asvel-villeurbanne-panathinaikos-opap-athens/E2021/233/',
 '/euroleague/game-center/2021-22/real-madrid-fc-barcelona/E2021/234/']

but it should return the links for game 1 through game 9.

CodePudding user response:

What happens?

The content of the website is generated dynamically, and requests only downloads the HTML as served; it cannot interpret or render the page the way a browser does.

How to fix?

Option #1:

Use the API to get the match information directly:

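import requests

# Feed endpoint from the answer; roundNumber selects the game week
URL = 'https://feeds.incrowdsports.com/provider/euroleague-feeds/v2/competitions/E/seasons/E2021/games?teamCode=&phaseTypeCode=RS&roundNumber=1'
headers = {
    'accept': '*/*',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
}
r = requests.get(URL, headers=headers)
r.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
r.json()['data']  # the match data for the requested round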
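
Since the question is about picking a specific game week, the request above could be wrapped so that the round is a parameter. This is only a sketch: the helper names games_url and fetch_games are made up for illustration, and it assumes that changing roundNumber (and the season code in the path) is all the feed needs, which the answer only demonstrates for round 1 of E2021.

```python
from urllib.parse import urlencode

# Base of the feed URL taken from the answer above
BASE = "https://feeds.incrowdsports.com/provider/euroleague-feeds/v2"


def games_url(season="E2021", round_number=1, phase="RS"):
    """Build the feed URL for one round (hypothetical helper).

    Assumption: other rounds and seasons follow the same URL pattern
    as the single example URL shown in the answer.
    """
    query = urlencode({
        "teamCode": "",
        "phaseTypeCode": phase,
        "roundNumber": round_number,
    })
    return f"{BASE}/competitions/E/seasons/{season}/games?{query}"


def fetch_games(season="E2021", round_number=1):
    """Request one round and return the 'data' part of the JSON response."""
    # Third-party dependency, imported here so games_url works without it
    import requests

    headers = {"accept": "*/*", "user-agent": "Mozilla/5.0"}
    r = requests.get(games_url(season, round_number), headers=headers, timeout=10)
    r.raise_for_status()
    return r.json()["data"]
```

Calling fetch_games(round_number=2) would then request the second game week. The structure of the returned data is not documented here, so it is worth printing it once before relying on any field.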

Option #2:

Use selenium to render the page as a browser would, then scrape the data from driver.page_source.
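
A rough sketch of that approach, assuming Selenium 4 with a local Chrome install and BeautifulSoup for parsing. The function names are made up for illustration. The selector matches on the class prefix from the question, because the hashed suffix (__u3Tea) may change between site builds. The wait condition is also an assumption: if the initial HTML already contains game cards for another round, a different wait (or a short delay) may be needed before the page source reflects the requested round.

```python
from bs4 import BeautifulSoup

# Class prefix taken from the question; the hashed suffix is left out on purpose
LINK_SELECTOR = 'a[class*="game-card-view_linkWrap"]'


def extract_game_links(html):
    """Return the href of every game-card link found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.select(LINK_SELECTOR) if a.has_attr("href")]


def render_page(url, timeout=15):
    """Load the URL in headless Chrome and return the rendered page source."""
    # Imported here so extract_game_links can be used without selenium installed
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Assumption: at least one game card appears once the page has rendered
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, LINK_SELECTOR))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if the wait times out
```

With these two pieces, extract_game_links(render_page(url)) would give the relative links for the page at url, in the same form as the list shown in the question.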
