I am trying to scrape data from NBA Stats, specifically the team box scores. I am looking for the nba_api endpoint behind this page so that I can scrape the data.
How can I find the endpoint?
CodePudding user response:
I'm not a huge sports fan, but this looks like it: A free NBA Boxscore API. You could also try isolating the CSS, JS, and HTML segments of the site.
The data is in the source of a div.
CodePudding user response:
You can find the endpoint by opening Dev Tools (Shift+Ctrl+I) and looking under Network -> XHR (you may need to refresh the page). Watch the panel for requests to start appearing, and find the one that contains your data. Then go to Headers to find the information needed to make the request:
import requests
import pandas as pd

# Endpoint and headers copied from the request shown in Dev Tools.
url = 'https://stats.nba.com/stats/leaguegamelog'
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
    'Referer': 'https://www.nba.com/',
}

# Query parameters for the request; PlayerOrTeam 'T' asks for team rows.
payload = {
    'Counter': '1000',
    'DateFrom': '',
    'DateTo': '',
    'Direction': 'DESC',
    'LeagueID': '00',
    'PlayerOrTeam': 'T',
    'Season': '2021-22',
    'SeasonType': 'Regular Season',
    'Sorter': 'DATE',
}

# The JSON response stores the column names and the data rows separately.
jsonData = requests.get(url, headers=headers, params=payload).json()
rows = jsonData['resultSets'][0]['rowSet']
columns = jsonData['resultSets'][0]['headers']
df = pd.DataFrame(rows, columns=columns)
Output:
print(df)
      SEASON_ID     TEAM_ID  TEAM_ABBREVIATION  ...  PTS  PLUS_MINUS  VIDEO_AVAILABLE
0         22021  1610612759                SAS  ...  110           2                1
1         22021  1610612744                GSW  ...  108          -2                1
2         22021  1610612761                TOR  ...   93           5                1
3         22021  1610612755                PHI  ...   88          -5                1
4         22021  1610612738                BOS  ...  124          20                1
...         ...         ...                ...  ...  ...         ...              ...
2133      22021  1610612754                IND  ...  122          -1                1
2134      22021  1610612749                MIL  ...  127          23                1
2135      22021  1610612751                BKN  ...  104         -23                1
2136      22021  1610612744                GSW  ...  121           7                1
2137      22021  1610612747                LAL  ...  114          -7                1

[2138 rows x 29 columns]
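If you would rather not retype the parameters by hand, here is a rough sketch of splitting a Request URL copied from the Headers tab into the endpoint and a params dict. The helper name split_request_url is made up for illustration, and the copied URL below is an assumption, rebuilt from the payload in this answer rather than taken from a real capture, so yours may look different.

```python
from urllib.parse import urlsplit, parse_qsl


def split_request_url(request_url):
    """Split a copied Request URL into (endpoint, params).

    Hypothetical helper, not part of requests or nba_api.
    """
    parts = urlsplit(request_url)
    endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # keep_blank_values=True keeps empty parameters such as DateFrom=
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    return endpoint, params


# Assumed example URL, reconstructed from the payload used above.
copied = (
    "https://stats.nba.com/stats/leaguegamelog?Counter=1000&DateFrom=&DateTo="
    "&Direction=DESC&LeagueID=00&PlayerOrTeam=T&Season=2021-22"
    "&SeasonType=Regular+Season&Sorter=DATE"
)

endpoint, params = split_request_url(copied)
print(endpoint)
print(params)
```

The resulting endpoint and params can then be passed to requests.get(endpoint, headers=headers, params=params) in the same way as url and payload above.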