Python: scrape data from an .aspx web page

I want to scrape the table from this webpage into a pandas DataFrame: https://www.perfectgame.org/College/CollegePlayerReports.aspx

I've tried both requests and requests-html, but neither seems to work:

from requests_html import HTMLSession
import pandas as pd


def get_stats(name, year):
    # name and year are not used yet
    with HTMLSession() as s:
        source = 'https://www.perfectgame.org/College/CollegePlayerReports.aspx'
        response = s.get(source)
        # find(..., first=True) returns None when no element matches the selector
        table = response.html.find('table.Grid', first=True)
        df = pd.read_html(table.html, header=0)[0]
        print(df)

Any solutions?

CodePudding user response:

To load the table into a pandas DataFrame, you can use the following example:

import requests
import pandas as pd
from bs4 import BeautifulSoup


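url = "https://www.perfectgame.org/College/CollegePlayerReports.aspx"
# Fetch the page and parse the returned HTML
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Data rows of the grid carry the class rgRow or rgAltRow
data = []
for row in soup.select("tbody tr.rgRow, tbody tr.rgAltRow"):
    # Join the cell texts with "|", then split them back into a list of values
    data.append(row.get_text(strip=True, separator="|").split("|"))

df = pd.DataFrame(
    data,
    columns=["Reports", "Draft Eligible", "Class", "College", "Report Date"],
)
# to_markdown() needs the optional tabulate package installed
print(df.to_markdown(index=False))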

Prints:

Reports	Draft Eligible	Class	College	Report Date
Drew Williamson	2022	Senior	Alabama	6/1/2022
Caden Rose	2023	Sophomore	Alabama	6/1/2022
Wyatt Langford	2023	Sophomore	Florida	6/1/2022
Nick Ficarrotta	2022	Freshman	Florida	6/1/2022
Fisher Jameson	2024	Freshman	Florida	6/1/2022

...
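
If you want to try the row selection without hitting the site, here is a minimal offline sketch of the same idea. The SAMPLE_HTML markup is a simplified assumption and is not copied from the real page, and rows_to_frame is a hypothetical helper name. Only the rgRow / rgAltRow class names, the column names, and the two sample rows come from the answer above. This variant reads each td separately instead of joining and splitting on "|".

```python
import pandas as pd
from bs4 import BeautifulSoup

# Assumed, simplified markup for illustration only; the real page is more complex.
SAMPLE_HTML = """
<table>
  <tbody>
    <tr class="rgRow">
      <td>Drew Williamson</td><td>2022</td><td>Senior</td>
      <td>Alabama</td><td>6/1/2022</td>
    </tr>
    <tr class="rgAltRow">
      <td>Caden Rose</td><td>2023</td><td>Sophomore</td>
      <td>Alabama</td><td>6/1/2022</td>
    </tr>
  </tbody>
</table>
"""

COLUMNS = ["Reports", "Draft Eligible", "Class", "College", "Report Date"]


def rows_to_frame(html):
    """Hypothetical helper: build a DataFrame from rgRow/rgAltRow table rows."""
    soup = BeautifulSoup(html, "html.parser")
    data = []
    for row in soup.select("tbody tr.rgRow, tbody tr.rgAltRow"):
        # One list entry per cell, so an empty cell stays in its column
        data.append([td.get_text(strip=True) for td in row.find_all("td")])
    return pd.DataFrame(data, columns=COLUMNS)


df = rows_to_frame(SAMPLE_HTML)
print(df)
```

Reading cell by cell is a design choice, not a requirement: get_text(strip=True, separator="|") skips cells that are empty, which would shift the remaining values into the wrong columns, and it would also split any value that itself contains "|".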