Extracting href from 'a' tag in Beautiful Soup

Time:09-29

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.betexplorer.com/odds-movements/soccer/'
soup = BeautifulSoup(requests.get(url).content)
matches = soup.find_all("td", class_="table-main__tt")

The code above extracts the matching td cells. How would I access the href in this result set? The target is the href link /soccer/usa/nisa/california-united-syracuse-pulse/Cz6Wt3DK/


CodePudding user response:

You are almost there. Just loop over the results and make each URL absolute by prepending the domain name:

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.betexplorer.com/odds-movements/soccer/'
soup = BeautifulSoup(requests.get(url).content,'lxml')
matches = soup.find_all("td", class_="table-main__tt")
for match in matches:
    link = 'https://www.betexplorer.com' + match.a.get('href')
    print(link)

Output:

https://www.betexplorer.com/soccer/usa/nisa/california-united-syracuse-pulse/Cz6Wt3DK/
https://www.betexplorer.com/soccer/europe/champions-league-women/juventus-koge/QVPxbOXg/
https://www.betexplorer.com/soccer/england/women-s-super-league/chelsea-west-ham/APprygcE/
https://www.betexplorer.com/soccer/italy/serie-d-group-b/seregno-calcio-brusaporto/jFQ7cE2j/
https://www.betexplorer.com/soccer/china/jia-league/shijiazhuang-gongfu-beijing-technology/hMapqmTR/
https://www.betexplorer.com/soccer/usa/usl-championship/atlanta-united-indy-eleven/z17PGtQS/
https://www.betexplorer.com/soccer/serbia/serbian-cup/macva-sabac-crvena-zvezda/KUtaIolr/
https://www.betexplorer.com/soccer/tanzania/ligi-kuu-bara/dodoma-jiji-geita-gold/fZuQPRCs/
https://www.betexplorer.com/soccer/europe/champions-league-women/real-madrid-rosenborg/hKMAfkR1/
https://www.betexplorer.com/soccer/europe/champions-league-women/bayern-munich-real-sociedad/hIRUap3s/
https://www.betexplorer.com/soccer/costa-rica/primera-division/cartagines-alajuelense/YJoh1ghF/
https://www.betexplorer.com/soccer/peru/liga-1/ad-cantolao-carlos-stein/b7ELqSxN/
https://www.betexplorer.com/soccer/uruguay/copa-uruguay/cerrito-montevideo-city/dAClWxIb/
https://www.betexplorer.com/soccer/argentina/copa-argentina/boca-juniors-quilmes/YeXavluo/
https://www.betexplorer.com/soccer/brazil/copa-do-brasil-u20/abc-lagarto/Ys8LTGdp/
https://www.betexplorer.com/soccer/usa/usl-championship/sacramento-republic-phoenix-rising/AgbiC0Yq/
https://www.betexplorer.com/soccer/england/fa-cup/fc-congleton-town-afc-fylde/0Cr7Zu9C/
https://www.betexplorer.com/soccer/argentina/primera-nacional/deportivo-madryn-sacachispas/ba4GSdAQ/
https://www.betexplorer.com/soccer/turkey/turkish-cup/diyarbekirspor-est-1977-kars-36-spor/z5RPnZgn/
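
A slightly more defensive variant of the same loop, as a sketch only: it builds the absolute URL with urllib.parse.urljoin from the standard library instead of string concatenation, and it skips cells that contain no link. The inline SAMPLE_HTML snippet and the helper name extract_match_links are assumptions made for illustration so the example runs without a network request; the live page's markup may differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.betexplorer.com"

# Hypothetical sample markup standing in for the live page.
SAMPLE_HTML = """
<table>
  <tr><td class="table-main__tt"><a href="/soccer/usa/nisa/california-united-syracuse-pulse/Cz6Wt3DK/">match</a></td></tr>
  <tr><td class="table-main__tt">no link here</td></tr>
</table>
"""


def extract_match_links(html, base_url=BASE_URL):
    """Return absolute URLs for every linked cell with class table-main__tt."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for cell in soup.find_all("td", class_="table-main__tt"):
        # find() returns None when the cell has no <a href=...>.
        anchor = cell.find("a", href=True)
        if anchor is None:
            continue
        # urljoin resolves the relative href against the site root.
        links.append(urljoin(base_url, anchor["href"]))
    return links


for link in extract_match_links(SAMPLE_HTML):
    print(link)
```

With the real page, the same function could be called on requests.get(url).content in place of SAMPLE_HTML.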

CodePudding user response:

Try:

matches[0].find("a")["href"]
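
For reference, a short sketch of how find, find_all, and select differ when reading href. find returns a single Tag (or None), so it can be indexed by attribute name directly, while find_all and select return list-like results, so the attribute has to be read from each element. The sample markup and its /soccer/a/ and /soccer/b/ paths are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's structure may differ.
SAMPLE_HTML = """
<table>
  <tr><td class="table-main__tt"><a href="/soccer/a/">A</a></td></tr>
  <tr><td class="table-main__tt"><a href="/soccer/b/">B</a></td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
matches = soup.find_all("td", class_="table-main__tt")

# find() gives one Tag, so attribute lookup works on it directly.
first_href = matches[0].find("a")["href"]

# find_all() gives a list-like ResultSet; read href from each Tag in it.
hrefs_in_first_cell = [a["href"] for a in matches[0].find_all("a", href=True)]

# A CSS selector collects every linked anchor inside the target cells at once.
all_hrefs = [a["href"] for a in soup.select("td.table-main__tt a[href]")]

print(first_href)
print(hrefs_in_first_cell)
print(all_hrefs)
```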