I am trying to use BeautifulSoup to request and parse each URL listed in a txt file. So far I can scrape the first link for what I need, but as soon as the loop moves on to the next URL I hit an error.
This is the error I keep getting:
AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
from bs4 import BeautifulSoup as bs
import requests
import constants as c

file = open(c.fvtxt)
read = file.readlines()
res = []
DOMAIN = c.vatican_domain
pdf = []

def get_soup(url):
    return bs(requests.get(url).text, 'html.parser')

for link in read:
    bs = get_soup(link)
    res.append(bs)
    soup = bs.find('div', {'class': 'headerpdf'})
    pdff = soup.find('a')
    li = pdff.get('href')
    surl = f"{DOMAIN}{li}"
    pdf.append(f"{surl}\n")
print(pdf)
CodePudding user response:
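The problem is your variable name. bs is the alias you imported BeautifulSoup under, and get_soup() looks that name up every time it is called. Inside the loop you rebind bs to the parsed page, so on the second iteration get_soup() calls that BeautifulSoup object instead of the class. Calling a BeautifulSoup object is shorthand for find_all(), which returns a ResultSet, and a ResultSet has no find() method, hence the error.
It should work fine if you rename the loop variable from bs to parsed_text, or to anything else other than bs.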
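To see the name clash in isolation, here is a minimal sketch that reproduces it without any network access. The HTML string and the names first, second, and error_message are made up for illustration; only the bs alias and get_soup() mirror the question.

```python
from bs4 import BeautifulSoup as bs

# Hypothetical stand-in for a downloaded page.
HTML = '<div class="headerpdf"><a href="/doc.pdf">PDF</a></div>'

def get_soup(markup):
    # The global name `bs` is looked up when the function runs,
    # not when it is defined.
    return bs(markup, "html.parser")

first = get_soup(HTML)
first_type = type(first).__name__    # BeautifulSoup: `bs` is still the class

# This is what `bs = get_soup(link)` does on the first loop iteration.
bs = first

# Now `bs(...)` calls the parsed document, which behaves like find_all().
second = get_soup(HTML)
second_type = type(second).__name__  # ResultSet

error_message = ""
try:
    second.find("div", {"class": "headerpdf"})
except AttributeError as exc:
    error_message = str(exc)

print(first_type, second_type)
print(error_message)
```

The second call never parses anything: it searches the first document and hands back a ResultSet, which is why the failure only appears from the second URL onward.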
for link in read:
    parsed_text = get_soup(link)
    res.append(parsed_text)
    soup = parsed_text.find('div', {'class': 'headerpdf'})
    pdff = soup.find('a')
    li = pdff.get('href')
    print(li)
    surl = f"{DOMAIN}{li}"
    pdf.append(f"{surl}\n")
print(pdf)
The result: