So I am trying to get a table off of
any help is appriciated!
CodePudding user response:
The desired table data is in html comment.So You can invoke beautifulsoup built-in package which is Comment
with lambda function to grab data.
import pandas as pd
import requests
from bs4 import BeautifulSoup
from bs4 import Comment
url='https://www.baseball-reference.com/register/team.cgi?id=9995d2a1'
req=requests.get(url)
soup=BeautifulSoup(req.text,'lxml')
df = pd.read_html([x for x in soup.find_all(string=lambda text: isinstance(text, Comment)) if 'id="div_team_pitching"' in x][0])[0]
print(df)
Output:
Rk Name Age W L W-L% ... H9 HR9 BB9 SO9 SO/W Notes
0 1.0 Logan Bursick-Harrington 21.0 0 2 0.000 ... 4.5 0.0 15.8 15.8 1.00 NaN
1 2.0 Cylis Cox* 19.0 1 0 1.000 ... 23.1 0.0 7.7 11.6 1.50 NaN
2 3.0 Travis Densmore* 21.0 0 1 0.000 ... 7.2 0.0 1.8 14.4 8.00 NaN
3 4.0 Dylan Freeman 22.0 1 0 1.000 ... 13.5 1.1 3.4 14.6 4.33 NaN
4 5.0 Zach Hopman* 22.0 0 1 0.000 ... 12.8 0.0 9.9 11.4 1.14 NaN
5 6.0 Eamon Horwedel 22.0 1 0 1.000 ... 9.0 0.0 6.4 6.4 1.00 NaN
6 7.0 Tyler Johnson 19.0 0 0 NaN ... 5.4 0.0 2.7 10.8 4.00 NaN
7 8.0 Trent Jones 20.0 0 0 NaN ... 14.6 1.1 2.3 12.4 5.50 NaN
8 9.0 Tanner Knapp 21.0 1 1 0.500 ... 11.6 0.0 7.7 4.8 0.63 NaN
9 10.0 Mason Majors 22.0 1 0 1.000 ... 4.9 0.0 7.4 12.3 1.67 NaN
10 11.0 Mason Meeks 21.0 0 1 0.000 ... 6.3 0.9 3.6 5.4 1.50 NaN
11 12.0 Sam Nagelvoort 19.0 0 1 0.000 ... 18.0 2.3 22.5 9.0 0.40 NaN
12 13.0 Tyler Nichol 20.0 0 0 NaN ... 27.0 0.0 27.0 0.0 0.00 NaN
13 14.0 Cole Russo 19.0 0 0 NaN ... 27.0 13.5 0.0 0.0 NaN NaN
14 15.0 Kyle Salley* 22.0 0 1 0.000 ... 9.0 2.3 22.5 9.0 0.40 NaN
15 16.0 Noah Stants 21.0 0 0 NaN ... 4.3 1.4 7.1 11.4 1.60 NaN
16 17.0 Quinn Waterhouse* 21.0 0 0 NaN ... 4.5 0.0 4.5 18.0 4.00 NaN
17 18.0 Nick Weyrich 19.0 0 0 NaN ... 6.4 1.3 7.7 11.6 1.50 NaN
18 19.0 Adam Wheaton 23.0 0 1 0.000 ... 11.7 1.8 4.5 12.6 2.80 NaN
19 NaN 19 Players 20.9 5 9 0.357 ... 9.2 0.8 6.9 10.7 1.55 NaN
[20 rows x 32 columns]