Scraping a web link from 247sports

Time:08-16

I am trying to grab the rankings history link from a player page using the following scraping code:

import requests
from bs4 import BeautifulSoup

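url = 'https://247sports.com/Player/Trevor-Lawrence-61350/college-212444/'
# headers was not shown in the original post; a browser-like User-Agent
# (borrowed from the answer below) is assumed here
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:103.0) Gecko/20100101 Firefox/103.0'}

pageTree = requests.get(url, headers=headers)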
Soup = BeautifulSoup(pageTree.content, 'html.parser')

past_link = Soup.find_all('ul', {'class':'ranks-list'})

past_link

I was able to generate this output

[<ul class="ranks-list">
 <li>
 <b>Natl.</b>
 <a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool">
 <strong>1</strong>
 </a>
 <a class="rank-history-link" href="https://247sports.com/PlayerSport/Trevor-Lawrence-at-Cartersville-116605/RecruitRankHistory/">
                     History
                 </a>
 </li>
 <li>
 <b>PRO</b>
 <a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool&amp;Position=PRO">
 <strong>1</strong>
 </a>
 </li>
 <li>
 <b>GA</b>
 <a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool&amp;State=GA">
 <strong>1</strong>
 </a>
 </li>
 <li>
 <b>All-Time</b>
 <a href="https://247sports.com/Sport/Football/AllTimeRecruitRankings/">
 <strong>6</strong>
 </a>
 </li>
 </ul>]

But going any further, for example with past_link.find_all('a'), raised an error. What should the next step be from here? Any assistance is truly appreciated. Thanks in advance.

CodePudding user response:

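The error occurs because find_all returns a ResultSet (a list of tags), not a single tag, so it has no find_all method of its own. To get the rankings history link from that page, you can select the link directly by its CSS class, as in the following example: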

import requests
from bs4 import BeautifulSoup

url = "https://247sports.com/Player/Trevor-Lawrence-61350/college-212444/"
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:103.0) Gecko/20100101 Firefox/103.0"
}
soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")

history_link = soup.select_one(".rank-history-link")["href"]
print(history_link)

Prints:

https://247sports.com/PlayerSport/Trevor-Lawrence-at-Cartersville-116605/RecruitRankHistory/
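If you would rather keep the original find_all approach, a minimal sketch along these lines may help. It loops over the ResultSet instead of calling find_all on it, and it also shows a variant of the CSS-selector lookup that returns None rather than raising when the link is missing. The HTML string is a trimmed copy of the output from the question so the example runs without a network request; the class attributes in it and the helper names find_history_links and first_history_link are assumptions made for illustration, not part of the site or of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup shown in the question. The class attributes
# are assumed from the selectors used in this thread.
HTML = """
<ul class="ranks-list">
  <li>
    <b>Natl.</b>
    <a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool"><strong>1</strong></a>
    <a class="rank-history-link" href="https://247sports.com/PlayerSport/Trevor-Lawrence-at-Cartersville-116605/RecruitRankHistory/">History</a>
  </li>
  <li>
    <b>All-Time</b>
    <a href="https://247sports.com/Sport/Football/AllTimeRecruitRankings/"><strong>6</strong></a>
  </li>
</ul>
"""


def find_history_links(html):
    """Return the href of every anchor inside ul.ranks-list whose URL
    contains 'RecruitRankHistory' (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # find_all returns a ResultSet (a list of Tag objects), so iterate
    # over it and call find_all on each individual tag.
    for ul in soup.find_all("ul", {"class": "ranks-list"}):
        for a in ul.find_all("a", href=True):
            if "RecruitRankHistory" in a["href"]:
                links.append(a["href"])
    return links


def first_history_link(html):
    """CSS-selector lookup that returns None instead of raising a
    TypeError when the link is absent (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one(".rank-history-link")
    return tag["href"] if tag is not None else None


history_links = find_history_links(HTML)
print(history_links)
```

With the live page, you would pass pageTree.content (or the response content from the answer above) in place of HTML. Filtering on the href text is only one possible design choice; it avoids depending on the class name, at the cost of depending on the URL pattern.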