How to extract all hrefs in a span with class?

Time:01-19

I want to scrape the lineups from the Spanish La Liga for the 2020/2021 season. I am struggling to get the player IDs and the player names per game and per team.

import re

import requests
from bs4 import BeautifulSoup

gamedays_url = range(1, 39)
url_list = []
daylinks = []
for gameday in gamedays_url:
    url = "https://www.transfermarkt.de/premier-league/spieltag/wettbewerb/ES1/plus/?saison_id=2020&spieltag=" + str(gameday)
    url_list.append(url)
    response = requests.get(url, headers={'User-Agent': 'Custom5'})

homelineup = []

gameLinks = []
for i in range(len(url_list)):
    page = url_list
    tree = requests.get(page[i], headers = {'User-Agent': 'Custom5'})
    soup_2 = BeautifulSoup(tree.content, 'html.parser')
    links_2 = soup_2.find_all("a", {"class": "liveLink"}, href=re.compile("spielbericht"))
    for j in range(len(links_2)):
            gameLinks.append(links_2[j].get("href"))

for p in range(len(gameLinks)):
    page = gameLinks[p]

    response = requests.get(page, headers={'User-Agent': 'Custom5'})
    lineup_data = response.text
    soup = BeautifulSoup(lineup_data, 'html.parser')

    # home team information
    homelineup = soup.find_all("div", {"class": "large-6 columns aufstellung-box"})
    for a in homelineup.select('span[class="aufstellung-rueckennummer-name"] a[href]'):
        home_test.append(a.get('href'))
        

But this does not work.

My problem is extracting the hrefs inside a span that has a class. I also need this for both the home and the away team.

The span class looks like this:

<span class="aufstellung-rueckennummer-name">
    <a href="/dani-parejo/profil/spieler/59561">Parejo</a>
</span>

CodePudding user response:

Use the selectors like this to select the first team:

soup.select('.aufstellung-box .aufstellung-rueckennummer-name a')

To select the sibling container that directly follows the first team's box and extract the second team, use the adjacent sibling combinator (+):

soup.select('.aufstellung-box + div .aufstellung-rueckennummer-name a')
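To see how the two selectors split the teams, here is a minimal offline sketch. The HTML is a simplified, made-up stand-in for the real page (the player names and ids are placeholders, and the real markup has far more nesting). It also hints at why the code in the question fails: `find_all` returns a list-like `ResultSet`, which has no `select` method, whereas calling `select` directly on the soup works.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup: the home box carries the class
# "aufstellung-box" and the away box is the div right after it.
html = """
<div class="row">
  <div class="large-6 columns aufstellung-box">
    <span class="aufstellung-rueckennummer-name">
      <a href="/home-player/profil/spieler/1">Home Player</a>
    </span>
  </div>
  <div class="large-6 columns">
    <span class="aufstellung-rueckennummer-name">
      <a href="/away-player/profil/spieler/2">Away Player</a>
    </span>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Descendant selector: links anywhere inside the box with class "aufstellung-box".
home_links = [a["href"] for a in soup.select(".aufstellung-box .aufstellung-rueckennummer-name a")]

# "+" matches the div that immediately follows that box, then descends into it.
away_links = [a["href"] for a in soup.select(".aufstellung-box + div .aufstellung-rueckennummer-name a")]

print(home_links)
print(away_links)
```

If the real page nests the two boxes differently, the second selector would need to be adjusted accordingly.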

Also, avoid juggling so many separate lists and store the information in a more structured way, such as a dict or a single list of dicts, so you can access and transform the data easily.

Example

import requests
from bs4 import BeautifulSoup

base_url = 'https://www.transfermarkt.de'
url = 'https://www.transfermarkt.de/spielbericht/index/spielbericht/3431907'
soup = BeautifulSoup(requests.get(url, headers={'User-Agent': 'Custom5'}).content)

data = []

# First team: links inside the box with class "aufstellung-box"
for e in soup.select('.aufstellung-box .aufstellung-rueckennummer-name a'):
    data.append({
        'team': e.find_previous('nobr').text,
        'player': e.text,
        'link': base_url + e.get('href')
    })

# Second team: links inside the div that directly follows that box
for e in soup.select('.aufstellung-box + div .aufstellung-rueckennummer-name a'):
    data.append({
        'team': e.find_previous('nobr').text,
        'player': e.text,
        'link': base_url + e.get('href')
    })

data

Output

[{'team': 'SD Eibar', 'player': 'Dmitrović', 'link': 'https://www.transfermarkt.de/marko-dmitrović/profil/spieler/94308'},
 {'team': 'SD Eibar', 'player': 'Bigas', 'link': 'https://www.transfermarkt.de/pedro-bigas/profil/spieler/203043'},
 {'team': 'SD Eibar', 'player': 'Oliveira', 'link': 'https://www.transfermarkt.de/paulo-oliveira/profil/spieler/139336'},
 {'team': 'SD Eibar', 'player': 'Correa', 'link': 'https://www.transfermarkt.de/rober-correa/profil/spieler/223890'},
 {'team': 'SD Eibar', 'player': 'Diop', 'link': 'https://www.transfermarkt.de/pape-diop/profil/spieler/39907'},
 {'team': 'SD Eibar', 'player': 'Expósito', 'link': 'https://www.transfermarkt.de/edu-exposito/profil/spieler/506248'},
 {'team': 'SD Eibar', 'player': 'Álvarez', 'link': 'https://www.transfermarkt.de/sergio-alvarez/profil/spieler/138935'},
 {'team': 'SD Eibar', 'player': 'Inui', 'link': 'https://www.transfermarkt.de/takashi-inui/profil/spieler/98249'},
 {'team': 'SD Eibar', 'player': 'León', 'link': 'https://www.transfermarkt.de/pedro-leon/profil/spieler/51587'},
 {'team': 'SD Eibar', 'player': 'Enrich', 'link': 'https://www.transfermarkt.de/sergi-enrich/profil/spieler/81988'},
 {'team': 'SD Eibar', 'player': 'García', 'link': 'https://www.transfermarkt.de/kike-garcia/profil/spieler/93936'},
 {'team': 'Celta Vigo', 'player': 'Villar', 'link': 'https://www.transfermarkt.de/ivan-villar/profil/spieler/297194'},
 {'team': 'Celta Vigo', 'player': 'Aidoo', 'link': 'https://www.transfermarkt.de/joseph-aidoo/profil/spieler/358250'},
 {'team': 'Celta Vigo', 'player': 'Araújo', 'link': 'https://www.transfermarkt.de/nestor-araujo/profil/spieler/64134'},
 {'team': 'Celta Vigo', 'player': 'Olaza', 'link': 'https://www.transfermarkt.de/lucas-olaza/profil/spieler/216240'},
 {'team': 'Celta Vigo', 'player': 'Mallo', 'link': 'https://www.transfermarkt.de/hugo-mallo/profil/spieler/119905'},
 {'team': 'Celta Vigo', 'player': 'Tapia', 'link': 'https://www.transfermarkt.de/renato-tapia/profil/spieler/277137'},
 {'team': 'Celta Vigo', 'player': 'Yokuslu', 'link': 'https://www.transfermarkt.de/okay-yokuslu/profil/spieler/137616'},
 {'team': 'Celta Vigo', 'player': 'Méndez', 'link': 'https://www.transfermarkt.de/brais-mendez/profil/spieler/309110'},
 {'team': 'Celta Vigo', 'player': 'Nolito', 'link': 'https://www.transfermarkt.de/nolito/profil/spieler/70934'},
 {'team': 'Celta Vigo', 'player': 'Mor', 'link': 'https://www.transfermarkt.de/emre-mor/profil/spieler/283223'},
 {'team': 'Celta Vigo', 'player': 'Aspas', 'link': 'https://www.transfermarkt.de/iago-aspas/profil/spieler/72047'}]
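
The question also asks for player ids. Judging only by the links shown here, the id appears to be the trailing numeric segment of each profile link; that is an assumption, not something the site documents. Under that assumption, a rough sketch for adding the id to each dict and grouping players by team could look like this (`player_id_from_link` is a hypothetical helper name, and the two rows are copied from the output above):

```python
from urllib.parse import urlparse


def player_id_from_link(link):
    """Return the trailing numeric path segment as an int, or None if absent.

    Assumes links look like ".../profil/spieler/<id>", as in the output above.
    """
    last_segment = urlparse(link).path.rstrip("/").rsplit("/", 1)[-1]
    return int(last_segment) if last_segment.isdigit() else None


# Two rows copied from the output above, in the same list-of-dicts shape.
rows = [
    {"team": "SD Eibar", "player": "Bigas",
     "link": "https://www.transfermarkt.de/pedro-bigas/profil/spieler/203043"},
    {"team": "Celta Vigo", "player": "Aspas",
     "link": "https://www.transfermarkt.de/iago-aspas/profil/spieler/72047"},
]

# Add the id to every row.
for row in rows:
    row["player_id"] = player_id_from_link(row["link"])

# Group (player_id, player) pairs by team.
by_team = {}
for row in rows:
    by_team.setdefault(row["team"], []).append((row["player_id"], row["player"]))

print(by_team)
```

The helper also accepts relative hrefs such as "/dani-parejo/profil/spieler/59561", so it can be applied before or after prefixing base_url.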