BeautifulSoup not finding all items in Amazon

Time:02-17

I wrote the following code to scrape titles and authors from an Amazon best-sellers page for books. However, find_all only returns information for the first 30 books instead of all 50 books on the page.

I noticed that the first 30 books are the ones that are already loaded before scrolling down the page, but I'm not sure whether this is the reason.

from requests_html import HTMLSession
from bs4 import BeautifulSoup

s = HTMLSession()
url = "https://www.amazon.com/Best-Sellers-Kindle-Store-Arts-Photography/zgbs/digital-text/154607011/ref=zg_bs_nav_digital-text_3_157325011"
r = s.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
books = soup.find_all("div", {"class":"_p13n-zg-list-grid-desktop_truncationStyles_p13n-sc-css-line-clamp-1__1Fn1y"})
    
    

Answer:

Try using the requests library, and change the selector to something less dynamic than the class value used in your code. Here is sample code using requests:

from requests import session
from bs4 import BeautifulSoup

s = session()
url = "https://www.amazon.com/Best-Sellers-Kindle-Store-Arts-Photography/zgbs/digital-text/154607011/ref=zg_bs_nav_digital-text_3_157325011"
r = s.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
books = soup.find_all("div", {"id":"gridItemRoot"})

print(len(books))

This prints the following in the terminal:

50
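
To see why a stable attribute is a safer selector than a generated class name, here is a minimal offline sketch. The markup in SAMPLE_HTML is made up for illustration and is not Amazon's real HTML; the class suffixes and the helper names find_by_class and find_by_id are assumptions. It only shows the general idea: a selector tied to one generated class value can miss items whose class differs, while the shared id matches every item.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (NOT Amazon's real HTML): every item container shares
# the same id, but the generated class suffix on the title differs.
SAMPLE_HTML = """
<div id="gridItemRoot"><div class="title_1Fn1y">Book A</div></div>
<div id="gridItemRoot"><div class="title_1Fn1y">Book B</div></div>
<div id="gridItemRoot"><div class="title_9Zk2q">Book C</div></div>
"""


def find_by_class(html, class_name):
    """Return all div elements carrying the given class value."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", {"class": class_name})


def find_by_id(html, element_id):
    """Return all div elements carrying the given id value."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", {"id": element_id})


by_class = find_by_class(SAMPLE_HTML, "title_1Fn1y")
by_id = find_by_id(SAMPLE_HTML, "gridItemRoot")

# The class-based selector misses the item with a different suffix,
# while the id-based selector finds every item container.
print(len(by_class), len(by_id))
print([item.get_text(strip=True) for item in by_id])
```

If the real page still returns fewer items than expected with a stable selector, the missing entries are probably not in the downloaded HTML at all, and no selector change will recover them.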