Webscraping NBA results

Time:10-23

I want to learn web scraping in Python, but I don't really know how or where to start. My code runs, but it only prints an empty list:

import requests
import urllib
from urllib.request import urlopen
from bs4 import BeautifulSoup
#import pandas as pd

html = urllib.request.urlopen("https://www.nba.com/games")
soup= BeautifulSoup(html, "lxml")
games= soup.find_all("li", class_= "w-full flex flex-col flex-1 md:w-7/12 lg:w-5/12")
print(games)

CodePudding user response:

Your script prints an empty list because there is no <li> element with the class you describe. There is, however, a <div> with a different class that wraps each game. Changing the call to this will work:

games = soup.find_all("div", class_= "shadow-block bg-white flex md:rounded text-sm relative mb-4")

Giving you:

[<div class="shadow-block bg-white flex md:rounded text-sm relative mb-4"><div class="w-full flex flex-col flex-1 md:w-7/12 lg:w-5/12"><a class="flex-1 px-2 pt-5 h-full block hover:no-underline relative text-sm pt-5 pb-4 mb-1 px-2" href="/game/dal-vs-atl-0022100014"><div class="flex"><article class="w-1/4"><figure class="mx-auto mb-2" style="width:52px;height:52px"><div class="TeamLogo_block__1FJrR"><img alt=" Logo" class="TeamLogo_logo__1CmT9" loading="lazy" src="https://cdn.nba.com/logos/nba/1610612742/primary/L/logo.svg" title=" Logo"/></div></figure><div class="flex justify-center items-center"><span class="whitespace-no-wrap">Mavericks</span></div><p class="leading-none text-center">-</p></article><div class="flex justify-center flex-1 text-center mt-3"><div class="w-1/3 text-left"></div><div class="flex-col items-start justify-start flex-1 w-full"><div class="flex flex-col items-center"><p class="text-xs uppercase mt-2">FINAL</p></div></div><div class="w-1/3 text-right"></div></div><article class="w-1/4"><figure class="mx-auto mb-2" style="width:52px;height:52px"><div class="TeamLogo_block__1FJrR"><img alt=" Logo" class="TeamLogo_logo__1CmT9" loading="lazy" src="https://cdn.nba.com/logos/nba/1610612737/primary/L/logo.svg" title=" Logo"/></div></figure><div class="flex justify-center items-center"><span class="whitespace-no-wrap">Hawks</span></div><p class="leading-none text-center">-</p></article></div></a><ul class="flex border-concrete border-t"><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="0022100014" data-id="nba:games:watch" data-premium="true" data-section="Watch" data-text="DAL @ ATL, 2021-10-21" data-track="video" href="/game/dal-vs-atl-0022100014?watch">WATCH</a></li><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="DAL @ ATL, 2021-10-21" 
data-content-id="0022100014" data-id="nba:games:main:box-score:cta" data-text="BOX SCORE" data-track="click" data-type="cta" href="/game/dal-vs-atl-0022100014/box-score#box-score">BOX SCORE</a></li><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="DAL @ ATL, 2021-10-21" data-content-id="0022100014" data-id="nba:games:main:game-details:cta" data-text="GAME DETAILS" data-track="click" data-type="cta" href="/game/dal-vs-atl-0022100014">GAME DETAILS</a></li></ul></div><div class="w-full border-l border-concrete p-5 hidden md:block md:w-5/12 lg:w-5/12 xl:w-4/12 md:px-5 md:pt-3 lg:p-5"><div class="w-full"><p class="t7 mb-2">Game<!-- --> Leaders</p><table class="w-full"><thead class="text-xs font-condensed"><tr class="border-b border-asphalt text-asphalt"><th class="font-normal text-left">PLAYER</th><th class="font-normal text-right">PTS</th><th class="font-normal text-right">REB</th><th class="font-normal text-right">AST</th></tr></thead><tbody><tr class="border-b border-concrete"><td class="flex items-center w-full leading-tight py-2"><div class="w-6 h-6 mr-1"><p>-</p></div><div class="GameCardLeaders_player__2ZGgP"></div></td><td class="text-right">-</td><td class="text-right">-</td><td class="text-right">-</td></tr><tr class="border-b border-concrete"><td class="flex items-center w-full leading-tight py-2"><div class="w-6 h-6 mr-1"><p>-</p></div><div class="GameCardLeaders_player__2ZGgP"></div></td><td class="text-right">-</td><td class="text-right">-</td><td class="text-right">-</td></tr></tbody></table></div></div><div class="w-full p-5 hidden lg:w-2/12 lg:block xl:w-3/12 pl-0"><div class="w-full h-full"><p class="t7 mb-2">Game Recap</p>-</div></div></div>, <div class="shadow-block bg-white flex md:rounded text-sm relative mb-4"><div class="w-full flex flex-col flex-1 md:w-7/12 lg:w-5/12"><a class="flex-1 px-2 pt-5 h-full block hover:no-underline relative text-sm pt-5 
pb-4 mb-1 px-2" href="/game/mil-vs-mia-0022100015"><div class="flex"><article class="w-1/4"><figure class="mx-auto mb-2" style="width:52px;height:52px"><div class="TeamLogo_block__1FJrR"><img alt=" Logo" class="TeamLogo_logo__1CmT9" loading="lazy" src="https://cdn.nba.com/logos/nba/1610612749/primary/L/logo.svg" title=" Logo"/></div></figure><div class="flex justify-center items-center"><span class="whitespace-no-wrap">Bucks</span></div><p class="leading-none text-center">-</p></article><div class="flex justify-center flex-1 text-center mt-3"><div class="w-1/3 text-left"></div><div class="flex-col items-start justify-start flex-1 w-full"><div class="flex flex-col items-center"><p class="text-xs uppercase mt-2">FINAL</p></div></div><div class="w-1/3 text-right"></div></div><article class="w-1/4"><figure class="mx-auto mb-2" style="width:52px;height:52px"><div class="TeamLogo_block__1FJrR"><img alt=" Logo" class="TeamLogo_logo__1CmT9" loading="lazy" src="https://cdn.nba.com/logos/nba/1610612748/primary/L/logo.svg" title=" Logo"/></div></figure><div class="flex justify-center items-center"><span class="whitespace-no-wrap">Heat</span></div><p class="leading-none text-center">-</p></article></div></a><ul class="flex border-concrete border-t"><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="0022100015" data-id="nba:games:watch" data-premium="true" data-section="Watch" data-text="MIL @ MIA, 2021-10-21" data-track="video" href="/game/mil-vs-mia-0022100015?watch">WATCH</a></li><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="MIL @ MIA, 2021-10-21" data-content-id="0022100015" data-id="nba:games:main:box-score:cta" data-text="BOX SCORE" data-track="click" data-type="cta" href="/game/mil-vs-mia-0022100015/box-score#box-score">BOX SCORE</a></li><li class="TabLink_tab__1ugCW 
block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="MIL @ MIA, 2021-10-21" data-content-id="0022100015" data-id="nba:games:main:game-details:cta" data-text="GAME DETAILS" data-track="click" data-type="cta" href="/game/mil-vs-mia-0022100015">GAME DETAILS</a></li></ul></div><div class="w-full border-l border-concrete p-5 hidden md:block md:w-5/12 lg:w-5/12 xl:w-4/12 md:px-5 md:pt-3 lg:p-5"><div class="w-full"><p class="t7 mb-2">Game<!-- --> Leaders</p><table class="w-full"><thead class="text-xs font-condensed"><tr class="border-b border-asphalt text-asphalt"><th class="font-normal text-left">PLAYER</th><th class="font-normal text-right">PTS</th><th class="font-normal text-right">REB</th><th class="font-normal text-right">AST</th></tr></thead><tbody><tr class="border-b border-concrete"><td class="flex items-center w-full leading-tight py-2"><div class="w-6 h-6 mr-1"><p>-</p></div><div class="GameCardLeaders_player__2ZGgP"></div></td><td class="text-right">-</td><td class="text-right">-</td><td class="text-right">-</td></tr><tr class="border-b border-concrete"><td class="flex items-center w-full leading-tight py-2"><div class="w-6 h-6 mr-1"><p>-</p></div><div class="GameCardLeaders_player__2ZGgP"></div></td><td class="text-right">-</td><td class="text-right">-</td><td class="text-right">-</td></tr></tbody></table></div></div><div class="w-full p-5 hidden lg:w-2/12 lg:block xl:w-3/12 pl-0"><div class="w-full h-full"><p class="t7 mb-2">Game Recap</p>-</div></div></div>, <div class="shadow-block bg-white flex md:rounded text-sm relative mb-4"><div class="w-full flex flex-col flex-1 md:w-7/12 lg:w-5/12"><a class="flex-1 px-2 pt-5 h-full block hover:no-underline relative text-sm pt-5 pb-4 mb-1 px-2" href="/game/lac-vs-gsw-0022100016"><div class="flex"><article class="w-1/4"><figure class="mx-auto mb-2" style="width:52px;height:52px"><div class="TeamLogo_block__1FJrR"><img alt=" Logo" 
class="TeamLogo_logo__1CmT9" loading="lazy" src="https://cdn.nba.com/logos/nba/1610612746/primary/L/logo.svg" title=" Logo"/></div></figure><div class="flex justify-center items-center"><span class="whitespace-no-wrap">Clippers</span></div><p class="leading-none text-center">-</p></article><div class="flex justify-center flex-1 text-center mt-3"><div class="w-1/3 text-left"></div><div class="flex-col items-start justify-start flex-1 w-full"><div class="flex flex-col items-center"><p class="text-xs uppercase mt-2">FINAL</p></div></div><div class="w-1/3 text-right"></div></div><article class="w-1/4"><figure class="mx-auto mb-2" style="width:52px;height:52px"><div class="TeamLogo_block__1FJrR"><img alt=" Logo" class="TeamLogo_logo__1CmT9" loading="lazy" src="https://cdn.nba.com/logos/nba/1610612744/primary/L/logo.svg" title=" Logo"/></div></figure><div class="flex justify-center items-center"><span class="whitespace-no-wrap">Warriors</span></div><p class="leading-none text-center">-</p></article></div></a><ul class="flex border-concrete border-t"><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="0022100016" data-id="nba:games:watch" data-premium="true" data-section="Watch" data-text="LAC @ GSW, 2021-10-21" data-track="video" href="/game/lac-vs-gsw-0022100016?watch">WATCH</a></li><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="LAC @ GSW, 2021-10-21" data-content-id="0022100016" data-id="nba:games:main:box-score:cta" data-text="BOX SCORE" data-track="click" data-type="cta" href="/game/lac-vs-gsw-0022100016/box-score#box-score">BOX SCORE</a></li><li class="TabLink_tab__1ugCW block flex-1"><a class="block py-3 text-xs font-bold text-center text-cerulean Anchor_complexLink__2NtkO" data-content="LAC @ GSW, 2021-10-21" data-content-id="0022100016" 
data-id="nba:games:main:game-details:cta" data-text="GAME DETAILS" data-track="click" data-type="cta" href="/game/lac-vs-gsw-0022100016">GAME DETAILS</a></li></ul></div><div class="w-full border-l border-concrete p-5 hidden md:block md:w-5/12 lg:w-5/12 xl:w-4/12 md:px-5 md:pt-3 lg:p-5"><div class="w-full"><p class="t7 mb-2">Game<!-- --> Leaders</p><table class="w-full"><thead class="text-xs font-condensed"><tr class="border-b border-asphalt text-asphalt"><th class="font-normal text-left">PLAYER</th><th class="font-normal text-right">PTS</th><th class="font-normal text-right">REB</th><th class="font-normal text-right">AST</th></tr></thead><tbody><tr class="border-b border-concrete"><td class="flex items-center w-full leading-tight py-2"><div class="w-6 h-6 mr-1"><p>-</p></div><div class="GameCardLeaders_player__2ZGgP"></div></td><td class="text-right">-</td><td class="text-right">-</td><td class="text-right">-</td></tr><tr class="border-b border-concrete"><td class="flex items-center w-full leading-tight py-2"><div class="w-6 h-6 mr-1"><p>-</p></div><div class="GameCardLeaders_player__2ZGgP"></div></td><td class="text-right">-</td><td class="text-right">-</td><td class="text-right">-</td></tr></tbody></table></div></div><div class="w-full p-5 hidden lg:w-2/12 lg:block xl:w-3/12 pl-0"><div class="w-full h-full"><p class="t7 mb-2">Game Recap</p>-</div></div></div>]
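
Passing the whole class string to class_ only matches while the attribute stays exactly the same, in the same order, which is fragile with utility-class markup like this. As a rough sketch of a looser alternative, the snippet below selects on a single class with a CSS selector. SAMPLE_HTML is a hand-trimmed stand-in for the output above, extract_matchups is a hypothetical helper name (not part of BeautifulSoup), and whether the live page still uses these class names is an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical, heavily trimmed stand-in for the nba.com markup shown above.
SAMPLE_HTML = """
<div class="shadow-block bg-white flex md:rounded text-sm relative mb-4">
  <a href="/game/dal-vs-atl-0022100014">
    <span class="whitespace-no-wrap">Mavericks</span>
    <span class="whitespace-no-wrap">Hawks</span>
  </a>
</div>
<div class="shadow-block bg-white flex md:rounded text-sm relative mb-4">
  <a href="/game/mil-vs-mia-0022100015">
    <span class="whitespace-no-wrap">Bucks</span>
    <span class="whitespace-no-wrap">Heat</span>
  </a>
</div>
"""


def extract_matchups(html):
    """Return a list of (team_names, game_link) tuples, one per game card."""
    soup = BeautifulSoup(html, "html.parser")
    matchups = []
    # Select on one class only, so extra or reordered classes do not break the match.
    for card in soup.select("div.shadow-block"):
        teams = [
            span.get_text(strip=True)
            for span in card.select("span.whitespace-no-wrap")
        ]
        link = card.find("a", href=True)
        matchups.append((teams, link["href"] if link else None))
    return matchups


for teams, href in extract_matchups(SAMPLE_HTML):
    print(" @ ".join(teams), href)
```

To try this on the live page, pass the fetched HTML to extract_matchups instead of SAMPLE_HTML.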

CodePudding user response:

You'd be far better off getting the data directly rather than parsing the HTML. I'm not sure what data you want, but it's all in the JSON.

import requests

jsonData = requests.get("https://cdn.nba.com/static/json/liveData/scoreboard/todaysScoreboard_00.json").json()
scoreboard = jsonData['scoreboard']

gameDate = scoreboard['gameDate']
print(f'{gameDate}')
for game in scoreboard['games']:
    homeTeam = game['homeTeam']['teamCity'] + ' ' + game['homeTeam']['teamName']
    homeScore = game['homeTeam']['score']
    awayTeam = game['awayTeam']['teamCity'] + ' ' + game['awayTeam']['teamName']
    awayScore = game['awayTeam']['score']
    
    print(f'{awayTeam}: {awayScore} @ {homeTeam}: {homeScore}')

Output:

2021-10-21
Dallas Mavericks: 87 @ Atlanta Hawks: 113
Milwaukee Bucks: 95 @ Miami Heat: 137
LA Clippers: 113 @ Golden State Warriors: 115
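
If you want something a little more defensive, here is a minimal sketch that separates fetching from formatting, so the formatting part can be checked without touching the network. It uses only the standard library for the request. The function names and SAMPLE_SCOREBOARD are my own, and the sample holds only the keys used above, filled in from the output shown; the real payload has more fields, and its layout could change.

```python
import json
from urllib.request import urlopen

SCOREBOARD_URL = (
    "https://cdn.nba.com/static/json/liveData/scoreboard/todaysScoreboard_00.json"
)

# Hand-built sample with only the keys the loop above reads (an assumption
# about the payload shape, reconstructed from the printed output).
SAMPLE_SCOREBOARD = {
    "gameDate": "2021-10-21",
    "games": [
        {
            "homeTeam": {"teamCity": "Atlanta", "teamName": "Hawks", "score": 113},
            "awayTeam": {"teamCity": "Dallas", "teamName": "Mavericks", "score": 87},
        }
    ],
}


def fetch_scoreboard(url=SCOREBOARD_URL, timeout=10):
    """Download the JSON document and return its 'scoreboard' object."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)["scoreboard"]


def format_games(scoreboard):
    """Return the game date followed by one 'away @ home' line per game."""
    lines = [scoreboard.get("gameDate", "unknown date")]
    for game in scoreboard.get("games", []):
        home, away = game["homeTeam"], game["awayTeam"]
        lines.append(
            f"{away['teamCity']} {away['teamName']}: {away['score']} @ "
            f"{home['teamCity']} {home['teamName']}: {home['score']}"
        )
    return lines


# Offline demonstration; swap in fetch_scoreboard() for live data.
for line in format_games(SAMPLE_SCOREBOARD):
    print(line)
```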

CodePudding user response:

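You should use this code. There was no <li> tag with that class in the HTML; the class belongs to a <div>. Do remember that BS4 only parses the HTML that the server sends back. If the content you want is not in the page source, for example because it is added later by JavaScript, then the code will not find it.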

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("https://www.nba.com/games")
soup = BeautifulSoup(html, "lxml")
games = soup.find_all("div", class_="w-full flex flex-col flex-1 md:w-7/12 lg:w-5/12")
print(games)
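
One way to tell a wrong tag name apart from markup that is simply missing from the response is to probe the parsed page in steps. The helper below is only a sketch: diagnose is a hypothetical name, the sample HTML is made up for illustration, and it checks a single class token rather than the full class string.

```python
from bs4 import BeautifulSoup


def diagnose(html, tag, class_token):
    """Explain why a tag/class lookup might come back empty."""
    soup = BeautifulSoup(html, "html.parser")
    # A single class token matches any element that carries it among its classes.
    if soup.find_all(tag, class_=class_token):
        return "found"
    if soup.find_all(class_=class_token):
        return "class present on a different tag"
    return "class absent from the served HTML"


# Made-up sample: the class sits on a <div>, not an <li>.
sample = '<div class="w-full flex flex-col">game card</div>'

print(diagnose(sample, "li", "w-full"))
print(diagnose(sample, "div", "w-full"))
print(diagnose(sample, "li", "scoreboard"))
```

If the last message comes back for the real page, the markup is probably generated in the browser, and parsing the downloaded HTML will not help.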

CodePudding user response:

Follow my code:

from bs4 import BeautifulSoup
import requests

source = requests.get('https://www.nba.com/games').text

print(source)

soup = BeautifulSoup(source, 'lxml')

print(soup)

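games = soup.find('div', class_='w-full flex flex-col flex-1 md:w-7/12 lg:w-5/12')

# find() returns None when nothing matches, so check before reading .text
print(games.text if games is not None else 'No matching element found')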