Im new to programming and cannot figure out why this wont loop. It prints and converts the first item exactly how I want. But stops after the first iteration.
from bs4 import BeautifulSoup
import requests
import re
import json
url = 'http://books.toscrape.com/'
page = requests.get(url)
html = BeautifulSoup(page.content, 'html.parser')
section = html.find_all('ol', class_='row')
for books in section:
#Title Element
header_element = books.find("article", class_='product_pod')
title_element = header_element.img
title = title_element['alt']
#Price Element
price_element = books.find(class_='price_color')
price_str = str(price_element.text)
price = price_str[1:]
#Create JSON
final_results_json = {"Title":title, "Price":price}
final_result = json.dumps(final_results_json, sort_keys=True, indent=1)
print(title)
print(price)
print()
print(final_result)
CodePudding user response:
First, clarify what you are looking for? Probably, you wish to print the title, price and final_result for every book that has been scraped from the URL books.toscrape.com. The code is working as it is written though the expectation is different. If you notice you are finding all the "ol" tags with class name = "row" and there's just one such element on the page thus, section has only one element eventually the for loop iterates just once.
How to debug it?
Check the type of section, type(section)
Print the section to know what it contains
write some print statements in for loop to understand what happens when
It isn't hard to debug this one. You need to change:
section = html.find_all('li', class_='col-xs-6 col-sm-4 col-md-3 col-lg-3')
CodePudding user response:
there is only 1 <ol>
in that doc
I think you want
for book in section[0].find_all('li'):
ol
means ordered list, of which there is one in this case, there are many li
or list items in that ol