I get the error ValueError: 60 columns passed, passed data had 282 columns
How can I solve this error? The page link is in the code below.
import requests
from bs4 import BeautifulSoup
import pandas as pd
headers= {'User-Agent': 'Mozilla/5.0'}
# collect all scraped rows in this list
temp = []
response = requests.get('https://www.basketball-reference.com/boxscores/202110190MIL.html')
soup = BeautifulSoup(response.content, 'html.parser')
table=soup.find('table', class_='sortable stats_table')
headers=[tup.text for tup in table.find_all("th")]
for row in table:
    temp.append([row.text for row in table.find_all('td')])
df = pd.DataFrame(temp,columns=headers)
print(df)
CodePudding user response:
In this case it's easier to just read the page directly with pandas:
tables = pd.read_html(response.text)
This gives you 16 tables: the basic and advanced box scores for both teams, including the header rows and totals.
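As a rough sketch of how one of those tables could be picked out and cleaned up, the snippet below runs pd.read_html on a small inline HTML string instead of the live page. The HTML, the player names and the numbers are made up for illustration; only the table id box-BRK-game-basic is taken from this thread. It assumes an HTML parser that pandas can use (lxml, or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical, simplified stand-in for the real page (made-up players and stats).
HTML = """
<table id="box-BRK-game-basic">
  <thead>
    <tr><th></th><th colspan="2">Basic Box Score Stats</th></tr>
    <tr><th>Starters</th><th>MP</th><th>PTS</th></tr>
  </thead>
  <tbody>
    <tr><th>Player A</th><td>30:00</td><td>20</td></tr>
    <tr><th>Player B</th><td>25:00</td><td>12</td></tr>
  </tbody>
</table>
"""

# attrs narrows the result to tables whose attributes match, here the id.
tables = pd.read_html(StringIO(HTML), attrs={"id": "box-BRK-game-basic"})
df = tables[0]

# A two-row <thead> is parsed as a MultiIndex; keep only the lowest level.
if isinstance(df.columns, pd.MultiIndex):
    df.columns = df.columns.get_level_values(-1)

print(df)
```

With the real page you would pass response.text (wrapped in StringIO) in place of HTML; selecting by id avoids having to guess which of the 16 list positions holds the table you want.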
CodePudding user response:
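A BeautifulSoup answer: select one table by its id, take the column names from the second header row only, and build one list per body row.
table = soup.select_one('table#box-BRK-game-basic')
# only the second <thead> row holds the column names
headers = [tup.text for tup in table.select("thead tr:nth-of-type(2) th")]
# skip the repeated header rows inside <tbody>
for row in table.tbody.select('tr:not(.thead)'):
    # one list per table row, so each row lines up with the headers
    cells = [cell.text for cell in row.children]
    temp.append(cells)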
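The code in the question fails because table.find_all('td') returns every cell in the whole table, so all of the data ends up in one 282-item row, while find_all("th") also picks up headers that are not column names. A minimal end-to-end sketch of the per-row approach follows. It uses a made-up inline HTML string rather than the live page, and the names col_names and rows are illustrative, not from the original answer.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical, simplified stand-in for the real page (made-up players and stats).
HTML = """<table id="box-BRK-game-basic">
<thead>
<tr><th></th><th colspan="2">Basic Box Score Stats</th></tr>
<tr><th>Starters</th><th>MP</th><th>PTS</th></tr>
</thead>
<tbody>
<tr><th>Player A</th><td>30:00</td><td>20</td></tr>
<tr class="thead"><th>Reserves</th><th>MP</th><th>PTS</th></tr>
<tr><th>Player B</th><td>25:00</td><td>12</td></tr>
</tbody>
</table>"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.select_one("table#box-BRK-game-basic")

# Column names come from the second header row only.
col_names = [th.get_text(strip=True)
             for th in table.select("thead tr:nth-of-type(2) th")]

# Build one list per body row, skipping the repeated header row.
rows = []
for tr in table.tbody.select("tr:not(.thead)"):
    rows.append([cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])])

df = pd.DataFrame(rows, columns=col_names)
print(df)
```

Using find_all(["th", "td"]) instead of row.children keeps only the cell tags, which avoids picking up whitespace between cells if the markup is formatted differently.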