from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq
import pandas as pd
sca_url = "https://steakcookoffs.com/cookoffs?EventViewMode=1&EventListViewMode=1"
client = uReq(sca_url)
page_html = client.read()
page_soup = soup(page_html, features='lxml')
sca_reg_links_tags = page_soup.select(".inner a")
print(sca_reg_links_tags)
How can I get just the registration link? I've also tried sca_reg_links_tags = page_soup.find('div', {"class":"inner"}), but it returns the same thing.
CodePudding user response:
Try:
page_soup.find_all("a", string="Register")
Also, bs4 documentation:
https://www.crummy.com/software/BeautifulSoup/bs4/doc/
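To show how the string filter behaves, here is a minimal sketch run against a small inline HTML snippet. The markup and the example.com URLs are made up for illustration, so the real page may be structured differently. Note that string="Register" compares the link text as-is, so the sketch also includes a more tolerant variant that strips surrounding whitespace first.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may differ.
html = """
<div class="inner">
  <a href="https://example.com/event-1" title="View event details">Steak Night</a>
  <a href="https://example.com/event-1/Registration">Register</a>
  <a href="https://example.com/event-2/Registration"> Register </a>
</div>
"""

# html.parser ships with Python, so no lxml install is needed here.
page_soup = BeautifulSoup(html, "html.parser")

# Plain string filter: the anchor's text is compared to "Register" as-is.
exact = page_soup.find_all("a", string="Register")

# Tolerant filter: a function receives the anchor's text (or None when the
# anchor has no single string) and strips whitespace before comparing.
tolerant = page_soup.find_all(
    "a", string=lambda s: s is not None and s.strip() == "Register"
)

# Pull the href out of each matching anchor.
register_links = [a["href"] for a in tolerant]
print(register_links)
```

If the link text on the real page is wrapped in extra tags or padded with whitespace, the function-based filter is likely the safer choice.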
CodePudding user response:
Try as follows:
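# Each event link carries the title "View event details"
sca_reg_links_tags = page_soup.find_all('a', {'title': 'View event details'})
lst = []
for link in sca_reg_links_tags:
    # Build the registration URL from the event URL
    lst.append(link['href'] + '/Registration')
lst[:5]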
['https://steakcookoffs.com/event-4572070/Registration',
'https://steakcookoffs.com/event-4572070/Registration',
'https://steakcookoffs.com/event-4692176/Registration',
'https://steakcookoffs.com/event-4692176/Registration',
'https://steakcookoffs.com/event-4901583/Registration']
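The output above repeats each URL, presumably because the page has more than one anchor with that title per event. A minimal sketch for dropping the repeats while keeping the original order follows. The helper name registration_urls is hypothetical, and the sample list is just the event URLs from the output above with the suffix removed.

```python
# Sample event hrefs taken from the output above (suffix removed).
event_links = [
    "https://steakcookoffs.com/event-4572070",
    "https://steakcookoffs.com/event-4572070",
    "https://steakcookoffs.com/event-4692176",
    "https://steakcookoffs.com/event-4692176",
    "https://steakcookoffs.com/event-4901583",
]


def registration_urls(hrefs):
    """Append '/Registration' to each href and drop repeats, keeping first-seen order."""
    urls = (href.rstrip("/") + "/Registration" for href in hrefs)
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(urls))


unique_links = registration_urls(event_links)
print(unique_links)
```

With the real page you would pass in [link['href'] for link in sca_reg_links_tags] instead of the sample list.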
Happy cooking!