When I parse for more than one class, I get an error on line 12 (after changing find to find_all). Error: ResultSet object has no attribute 'find'. You're probably treating a list of elements like a single element.
import requests
from bs4 import BeautifulSoup
heroes_page_list=[]
url = f'https://dota2.fandom.com/wiki/Dota_2_Wiki'
q = requests.get(url)
result = q.content
soup = BeautifulSoup(result, 'lxml')
heroes = soup.find_all('div', class_= 'heroentry').find('a')
for hero in heroes:
    hero_url = heroes.get('href')
    heroes_page_list.append("https://dota2.fandom.com" + hero_url)
# print(heroes_page_list)
with open('heroes_page_list.txt', "w") as file:
    for line in heroes_page_list:
        file.write(f'{line}\n')
CodePudding user response:
You are calling find('a') on a list of div tags. find_all() returns a ResultSet, so you need to look for the a tag inside each div individually, like this:
heroes = soup.find_all('div', class_= 'heroentry')
a_tags = [hero.find('a') for hero in heroes]
for a_tag in a_tags:
    hero_url = a_tag.get('href')
    heroes_page_list.append("https://dota2.fandom.com" + hero_url)
heroes_page_list then looks like this:
['https://dota2.fandom.com/wiki/Abaddon',
'https://dota2.fandom.com/wiki/Alchemist',
'https://dota2.fandom.com/wiki/Axe',
'https://dota2.fandom.com/wiki/Beastmaster',
'https://dota2.fandom.com/wiki/Brewmaster',
'https://dota2.fandom.com/wiki/Bristleback',
'https://dota2.fandom.com/wiki/Centaur_Warrunner',
....
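As a side note on building the absolute URLs: plain string concatenation works here as long as every href is a site-relative path such as /wiki/Abaddon. A slightly more defensive sketch uses urllib.parse.urljoin from the standard library, which also copes with hrefs that are already absolute. The helper name to_absolute and the sample hrefs below are made up for illustration and are not taken from the live page.

```python
from urllib.parse import urljoin

BASE_URL = "https://dota2.fandom.com"


def to_absolute(hrefs, base=BASE_URL):
    """Turn relative or already-absolute hrefs into absolute URLs."""
    return [urljoin(base, href) for href in hrefs]


# Hypothetical hrefs, as they might come out of a_tag.get('href')
sample_hrefs = [
    "/wiki/Abaddon",
    "/wiki/Axe",
    "https://dota2.fandom.com/wiki/Lina",  # already absolute, left unchanged
]

absolute_urls = to_absolute(sample_hrefs)
print(absolute_urls)
```

In the loop above, this would amount to appending urljoin("https://dota2.fandom.com", hero_url) instead of concatenating the two strings.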
CodePudding user response:
The error message tells you what is wrong.
The find() method only works on a single element, while find_all() returns a list of elements (a ResultSet). You are trying to call find() on that list.
If you want to apply find('a'), you should do something similar to this:
heroes = soup.find_all('div', class_= 'heroentry')
for hero in heroes:
    hero_a_tag = hero.find('a')
    hero_url = hero_a_tag.get('href')
    heroes_page_list.append("https://dota2.fandom.com" + hero_url)
Basically, you have to call find() on every element in the list returned by find_all().
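To see the difference without hitting the network, here is a minimal sketch that reproduces the error and the fix on a small inline snippet. The HTML is invented for illustration and only assumes that each hero sits in a div with class heroentry, as in the question; the real page markup may differ. It also skips divs that contain no link, since find('a') returns None in that case, and uses the built-in html.parser so lxml is not required.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the wiki page; the real markup may differ.
html = """
<div class="heroentry"><a href="/wiki/Abaddon">Abaddon</a></div>
<div class="heroentry"><a href="/wiki/Axe">Axe</a></div>
<div class="heroentry">no link here</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet, which behaves like a list of Tag objects.
divs = soup.find_all("div", class_="heroentry")

# Calling find() on the ResultSet itself reproduces the error from the question.
try:
    divs.find("a")
    raised = False
except AttributeError:
    raised = True

# The fix: call find() on each Tag, and skip divs that have no usable link.
hero_urls = []
for div in divs:
    a_tag = div.find("a")  # None when the div contains no <a>
    if a_tag is not None and a_tag.get("href"):
        hero_urls.append("https://dota2.fandom.com" + a_tag["href"])

# Alternative: a CSS selector collects the links in one step.
selected = [a["href"] for a in soup.select("div.heroentry a[href]")]

print(raised, hero_urls, selected)
```

Note that select() returns every matching a tag inside each div, while find('a') returns only the first one per div, so the two approaches agree only when each entry holds a single link.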