I have this web scraper program in Python, but it prints both tennis players, Felix and Alexander. I would like to print only the first available tennis player as a separate item and exclude all the ones after it. What do I need to change in the code to do this?
Note: I wrote this in Visual Studio 2022 and set the program up to use the Microsoft Edge web browser.
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.betexplorer.com/tennis/atp-singles/basel/auger-aliassime-felix-bublik-alexander/U5HIueTc/")
webpage = response.content
soup = BeautifulSoup(webpage, "html.parser")
for h2 in soup.find_all('h2'):
    values = [data for data in h2.find_all('a')]
    for value in values:
        print(value.text.replace(" ","_"))
    print()
CodePudding user response:
Instead of looping over every tag, you can pass a CSS selector to the select() method, which returns a list of all matching tags, and print only the first element of that list.
import requests
from bs4 import BeautifulSoup
response = requests.get("https://www.betexplorer.com/tennis/atp-singles/basel/auger-aliassime-felix-bublik-alexander/U5HIueTc/")
webpage = response.content
soup = BeautifulSoup(webpage, "html.parser")
print(soup.select('h2 a')[0].text.replace(' ','_'))
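A minimal offline sketch of the same idea, assuming the heading looks roughly like the hypothetical HTML below (the real page's markup may differ). It uses select_one(), which returns the first match or None, so a page with no matching tag does not raise the IndexError that select(...)[0] would. The names SAMPLE_HTML and first_player are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the actual structure may differ.
SAMPLE_HTML = """
<h2><a href="/p1/">Auger-Aliassime Felix</a> - <a href="/p2/">Bublik Alexander</a></h2>
"""

def first_player(html):
    """Return the text of the first <a> inside an <h2>, with spaces replaced by underscores."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.select_one("h2 a")  # first match, or None if nothing matches
    if link is None:
        return None
    return link.get_text(strip=True).replace(" ", "_")

print(first_player(SAMPLE_HTML))
```

Returning None instead of indexing into an empty list lets the caller decide what to do when the page layout changes.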
CodePudding user response:
Instead of the loop, just print the text of the first h2 (soup.h2 is shorthand for soup.find('h2')):
print(soup.h2.text.strip())
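Whether this isolates a single player depends on the markup. The sketch below assumes a hypothetical heading in which one h2 holds both player links (the real page may be laid out differently) and compares the text of the whole h2 with the text of its first a tag.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a single <h2> holding both player links.
html = '<h2><a href="/p1/">Auger-Aliassime Felix</a> - <a href="/p2/">Bublik Alexander</a></h2>'
soup = BeautifulSoup(html, "html.parser")

whole_heading = soup.h2.text.strip()  # all text inside the first <h2>
first_link = soup.h2.a.text.strip()   # text of the first <a> inside that <h2>

print(whole_heading)
print(first_link.replace(" ", "_"))
```

Under that assumption, soup.h2.text contains both names, while soup.h2.a.text contains only the first.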