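import requests
import pandas as pd
from bs4 import BeautifulSoup

playerstats_url = 'https://www.pro-football-reference.com/boxscores/202110100tam.htm'
# `weeks` is defined elsewhere; the URL has no '{}' placeholder, so format() leaves it unchanged
for week in weeks:
    url1 = playerstats_url.format(week)
    data1 = requests.get(url1)
    # save a copy of each downloaded page
    with open('player/{}.html'.format(week), 'w') as f:
        f.write(data1.text)
# `page` was undefined in the original; the downloaded HTML is assumed here
page = data1.text
soup = BeautifulSoup(page, 'html.parser')
week1_stats = soup.find('div', {'id': 'team_stats'})
tam2021 = pd.read_html(str(week1_stats))[0]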
I am trying to pull the 'Team Stats' table from the Pro Football Reference website, but I keep getting 'ValueError: No tables found'.
CodePudding user response:
This worked for me...
import requests
from bs4 import BeautifulSoup
import pandas as pd
html = requests.get('https://www.pro-football-reference.com/boxscores/202110100tam.htm')
soup = BeautifulSoup(html.text, 'html.parser')
stats = soup.find('div', {'id':'all_player_offense'})
pd.read_html(str(stats))
Which returns...
[ Unnamed: 0_level_0 Unnamed: 1_level_0 Passing Rushing Receiving Fumbles
Player Tm Cmp Att Yds TD Int Sk Yds.1 Lng Rate Att Yds TD Lng Tgt Rec Yds TD Lng Fmb FL
0 Jacoby Brissett MIA 27 39 275 2 1 3 13 34 95.6 0 0 0 0 0 0 0 0 0 1 1
1 Myles Gaskin MIA 0 0 0 0 0 0 0 0 NaN 5 25 0 13 10 10 74 2 24 0 0
2 Preston Williams MIA 0 0 0 0 0 0 0 0 NaN 1 7 0 7 5 3 60 0 34 0 0
3 Salvon Ahmed MIA 0 0 0 0 0 0 0 0 NaN 2 5 0 4 3 2 16 0 11 0 0
4 Jaylen Waddle MIA 0 0 0 0 0 0 0 0 NaN 1 2 0 2 6 2 31 0 21 0 0
5 Mike Gesicki MIA 0 0 0 0 0 0 0 0 NaN 0 0 0 0 7 4 43 0 23 0 0
6 Durham Smythe MIA 0 0 0 0 0 0 0 0 NaN 0 0 0 0 3 2 23 0 21 0 0
7 Adam Shaheen MIA 0 0 0 0 0 0 0 0 NaN 0 0 0 0 2 2 15 0 10 0 0
8 Mack Hollins MIA 0 0 0 0 0 0 0 0 NaN 0 0 0 0 2 1 10 0 10 0 0
9 Isaiah Ford MIA 0 0 0 0 0 0 0 0 NaN 0 0 0 0 1 1 3 0 3 0 0
10 NaN NaN Passing Passing Passing Passing Passing Passing Passing Passing Passing Rushing Rushing Rushing Rushing Receiving Receiving Receiving Receiving Receiving Fumbles Fumbles
11 Player Tm Cmp Att Yds TD Int Sk Yds Lng Rate Att Yds TD Lng Tgt Rec Yds TD Lng Fmb FL
12 Tom Brady TAM 30 41 411 5 0 2 15 62 144.4 1 13 0 13 0 0 0 0 0 0 0
13 Blaine Gabbert TAM 3 3 41 0 0 0 0 23 118.7 3 -1 0 0 0 0 0 0 0 0 0
14 Leonard Fournette TAM 0 0 0 0 0 0 0 0 NaN 12 67 1 17 5 4 43 0 16 0 0
15 Ronald Jones II TAM 0 0 0 0 0 0 0 0 NaN 5 21 0 5 1 1 15 0 15 0 0
16 Giovani Bernard TAM 0 0 0 0 0 0 0 0 NaN 4 21 0 17 2 2 14 1 10 0 0
17 Antonio Brown TAM 0 0 0 0 0 0 0 0 NaN 0 0 0 0 8 7 124 2 62 0 0
18 Mike Evans TAM 0 0 0 0 0 0 0 0 NaN 0 0 0 0 8 6 113 2 34 0 0
19 Chris Godwin TAM 0 0 0 0 0 0 0 0 NaN 0 0 0 0 11 7 70 0 18 0 0
20 Tyler Johnson TAM 0 0 0 0 0 0 0 0 NaN 0 0 0 0 3 3 42 0 19 0 0
21 O.J. Howard TAM 0 0 0 0 0 0 0 0 NaN 0 0 0 0 3 2 19 0 10 0 0
22 Cameron Brate TAM 0 0 0 0 0 0 0 0 NaN 0 0 0 0 1 1 12 0 12 0 0]
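Note that the result above repeats the header inside the data (rows 10 and 11) and has two-level column names. A minimal sketch of one way to tidy that up, assuming the stat columns come back as strings because of the embedded header rows. The helper name tidy_player_table and the small stand-in frame are made up for illustration, with values copied from the output above:

```python
import pandas as pd

# Small stand-in for the scraped table: two-level columns plus the
# repeated header rows that appear between the two teams.
columns = pd.MultiIndex.from_tuples([
    ("Unnamed: 0_level_0", "Player"),
    ("Unnamed: 1_level_0", "Tm"),
    ("Passing", "Yds"),
    ("Rushing", "Yds"),
])
raw = pd.DataFrame(
    [
        ["Jacoby Brissett", "MIA", "275", "0"],
        ["Myles Gaskin", "MIA", "0", "25"],
        [None, None, "Passing", "Rushing"],  # repeated top-level header
        ["Player", "Tm", "Yds", "Yds"],      # repeated second-level header
        ["Tom Brady", "TAM", "411", "13"],
    ],
    columns=columns,
)


def tidy_player_table(df):
    """Flatten the column names, drop repeated header rows, convert stats to numbers."""
    out = df.copy()
    # "Passing"/"Yds" becomes "Passing_Yds"; unnamed top levels keep only the lower name.
    out.columns = [
        low if top.startswith("Unnamed") else f"{top}_{low}"
        for top, low in out.columns
    ]
    # Header rows have no player name, or the literal text "Player".
    is_header = out["Player"].isna() | (out["Player"] == "Player")
    out = out[~is_header].reset_index(drop=True)
    stat_cols = [c for c in out.columns if c not in ("Player", "Tm")]
    out[stat_cols] = out[stat_cols].apply(pd.to_numeric, errors="coerce")
    return out


tidy = tidy_player_table(raw)
print(tidy)
```

On the real result you would pass the first element of the list that pd.read_html returns, and the column names may need adjusting to match what you actually get back.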
EDIT FOR UPDATED QUESTION
I found that the table is commented out in the HTML that requests returns and bs4 parses. I think the version shown on the site is loaded dynamically, and the requests library cannot handle pages that use JavaScript to load content.
The solution below works fine, but if you want the content that is loaded dynamically, you could try this library instead: https://pypi.org/project/requests-html/
import requests
from bs4 import BeautifulSoup
import pandas as pd
html = requests.get('https://www.pro-football-reference.com/boxscores/202110100tam.htm')
data = html.text.replace('<!--','').replace('-->','')
soup = BeautifulSoup(data, 'html.parser')
stats = soup.find('div', {'id':'div_team_stats'})
pd.read_html(str(stats))
This returns...
[            Unnamed: 0            MIA            TAM
0          First Downs             17             33
1         Rush-Yds-TDs         9-39-0       25-121-1
2    Cmp-Att-Yd-TD-INT  27-39-275-2-1  33-44-452-5-0
3         Sacked-Yards           3-13           2-15
4       Net Pass Yards            262            437
5          Total Yards            301            558
6         Fumbles-Lost            1-1            0-0
7            Turnovers              2              0
8      Penalties-Yards           5-37           6-47
9     Third Down Conv.            2-7           8-11
10   Fourth Down Conv.            0-0            0-0
11  Time of Possession          22:53          37:07]
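If you want to check which tables are hidden inside comments before stripping the comment markers from the whole page, something like this sketch might help. It does not hit the network: SAMPLE_HTML is a made-up stand-in for the page, and the ids all_team_stats and player_offense, as well as the helper commented_table_ids, are assumptions for illustration rather than names confirmed from the site.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for the downloaded page: one table sits in the
# normal markup, the other is wrapped in an HTML comment.
SAMPLE_HTML = """
<div id="all_player_offense">
  <table id="player_offense">
    <tr><th>Player</th></tr>
    <tr><td>Tom Brady</td></tr>
  </table>
</div>
<div id="all_team_stats">
  <!--
  <div id="div_team_stats">
    <table id="team_stats">
      <tr><th></th><th>MIA</th><th>TAM</th></tr>
      <tr><th>First Downs</th><td>17</td><td>33</td></tr>
    </table>
  </div>
  -->
</div>
"""


def commented_table_ids(html):
    """Return the ids of <table> elements that only exist inside HTML comments."""
    soup = BeautifulSoup(html, "html.parser")
    ids = []
    # Comment nodes are strings in the parse tree; select just those.
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        # Parse the comment body as HTML of its own and look for tables in it.
        inner = BeautifulSoup(str(comment), "html.parser")
        ids.extend(table.get("id") for table in inner.find_all("table"))
    return ids


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
visible_ids = [table.get("id") for table in soup.find_all("table")]
hidden_ids = commented_table_ids(SAMPLE_HTML)

print("visible:", visible_ids)
print("hidden in comments:", hidden_ids)
```

A table that only shows up in the second list is invisible to soup.find on the raw page, which would explain the "No tables found" error. The comment text for that table could then be passed to pd.read_html on its own instead of replacing every comment marker in the page.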