My list xfrs returns a blank DataFrame when I convert it. Does anyone see any issues with the code?
I can append to the list and print it fine, but the resulting DataFrame, transfers, is blank.
import requests
import pandas as pd
from bs4 import BeautifulSoup

url2 = 'https://247sports.com/Season/2020-Football/TransferPortalPositionRanking/'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
response = requests.get(url2, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')
xfrs = []
schools = []
for li in soup.findAll('li', attrs={'class':'transfer-player'}):
    xfrs.append(li.find('a').contents)
    schools.append(li.find('li', attrs={'class':'destination'}))
transfers = pd.DataFrame(xfrs, columns=['Players'])
print(transfers)
CodePudding user response:
As mentioned, .contents returns a list of BeautifulSoup objects rather than a string, so you need to use, for example, .text to get the name. Also take care with your selection: it should be more specific.
To store the scraped data in a DataFrame, try collecting it as a list of dicts:
data.append({
    'Player': li.h3.text,
    'Destination': destination['alt'] if (destination:=li.select_one('img[]')) else None
})
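To see the difference between .contents and .text in isolation, here is a minimal sketch that runs without hitting the site. The HTML string is a made-up stand-in for one list item (the real page's markup may differ), and get_text(strip=True) is used as one way of getting the text without surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on one list item; the real page may differ.
html = '<li class="transfer-player"><h3><a href="#"> JT Daniels </a></h3></li>'
li = BeautifulSoup(html, "html.parser").find("li", attrs={"class": "transfer-player"})

link = li.find("a")
contents = link.contents          # a list of child nodes, not a string
name = link.get_text(strip=True)  # a plain str with surrounding whitespace removed

print(type(contents).__name__, contents)
print(type(name).__name__, name)
```

Appending contents to a list gives a list of lists of node objects, whereas appending name gives a flat list of strings, which is what a single-column DataFrame expects.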
Example
import requests
import pandas as pd
from bs4 import BeautifulSoup

url2 = 'https://247sports.com/Season/2020-Football/TransferPortalPositionRanking/'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
response = requests.get(url2, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')

# Collect one dict per player; pandas turns the dict keys into columns.
data = []
for li in soup.find_all('li', attrs={'class':'transfer-player'}):
    data.append({
        'Player': li.h3.text,
        'Destination': destination['alt'] if (destination:=li.select_one('img[]')) else None
    })

pd.DataFrame(data)
Output
| Player | Destination |
|---|---|
| JT Daniels | Georgia |
| KJ Costello | Mississippi State |
| Jamie Newman | Georgia |
| ... | ... |
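For anyone who wants to try the list-of-dicts pattern offline, here is a rough sketch on inline HTML. Everything in the markup is an assumption for illustration: the placeholder player names, the school name, and in particular the destination-logo class on the image are hypothetical and not taken from the real page. It also shows what happens when a player has no destination listed.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup: the names and the "destination-logo" image class
# are assumptions for illustration, not taken from the real page.
html = """
<ul>
  <li class="transfer-player">
    <h3><a href="#">Player A</a></h3>
    <img class="destination-logo" alt="School X">
  </li>
  <li class="transfer-player">
    <h3><a href="#">Player B</a></h3>
  </li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

data = []
for li in soup.select("li.transfer-player"):
    logo = li.select_one("img.destination-logo")  # None when no destination is listed
    data.append({
        "Player": li.h3.get_text(strip=True),
        "Destination": logo["alt"] if logo else None,
    })

# One dict per row; the dict keys become the column names.
transfers = pd.DataFrame(data)
print(transfers)
```

Because every dict has the same keys, the columns line up even when a value is missing, so a player without a destination simply gets None in that cell instead of shifting the rows.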