Scraping a website keeps returning 'NoneType' object has no attribute 'find' error

Time:06-29

I'm new to BS4, so I'm having a hard time working out why this error keeps appearing. I want to extract the name, rank, author, rating and price of each book, but every time I run the code, each of those lookups keeps raising that error.

Here's my code:

try:
    url = requests.get("https://www.amazon.in/gp/bestsellers/books/1318158031/ref=zg_bs_nav_books_1")
    url.raise_for_status()

    soup = BeautifulSoup(url.text, "html.parser")

    books = soup.find("div", class_="p13n-gridRow _cDEzb_grid-row_3Cywl").find_all("div", id="gridItemRoot")
    
    for book in books:
        rank = book.find("div", class_="aok-float-left").span.text.split("#")[1]
        name = book.find("div", class_="zg-grid-general-faceout").span.div.text
        author = book.find("div", class_="a-row a-size-small").div.text
        rating = book.find("div", class_="a-icon-row").find("a", class_="a-link-normal")(["title"])
        price = book.find("div", class_="a-row").next_sibling.next_sibling.next_sibling.span.text

        print(name)
    
    

except Exception as e:
    print(e)

CodePudding user response:

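You are getting the NoneType error because some elements are missing from the listing's HTML tree for some items. When that happens, find() returns None, and any attribute access chained onto it fails. To get rid of the error, look the element up first and read .text only if it was found, using a conditional expression such as name.text if name else None.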
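To see where the error comes from, here is a minimal sketch that reproduces it offline. The HTML snippet and its class names (item, title, price) are made up for illustration and are not Amazon's markup; the second item deliberately has no price element.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration: the second item has no price element.
html = """
<div class="item"><span class="title">Book A</span><span class="price">₹299.00</span></div>
<div class="item"><span class="title">Book B</span></div>
"""
soup = BeautifulSoup(html, "html.parser")
items = soup.find_all("div", class_="item")

# find() returns None when nothing matches, so chaining onto the result fails.
try:
    items[1].find("span", class_="price").text
    error = None
except AttributeError as exc:
    error = str(exc)

# Guarded version: look the tag up first, read .text only if it exists.
prices = []
for item in items:
    tag = item.find("span", class_="price")
    prices.append(tag.text if tag else None)

print(error)
print(prices)
```

The same kind of guard, applied to the page from the question: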

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent with the request.
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36'}
url='https://www.amazon.in/gp/bestsellers/books/1318158031/ref=zg_bs_nav_books_1'
req = requests.get(url,headers=headers)
print(req)
soup = BeautifulSoup(req.text, 'lxml')  # the 'lxml' parser requires the lxml package
grid = soup.find("div", class_="p13n-gridRow _cDEzb_grid-row_3Cywl")
# find() returns None when nothing matches, so fall back to an empty list.
books = grid.find_all("div", id="gridItemRoot") if grid else []

for book in books:
    # Look each element up first and read .text only if it was found.
    rank = book.find("div", class_="aok-float-left")
    rank = rank.span.text.split("#")[-1] if rank and rank.span else None
    name = book.find("div", class_="zg-grid-general-faceout")
    name = name.span.div.text if name and name.span and name.span.div else None
    author = book.find("div", class_="a-row a-size-small")
    author = author.div.text if author and author.div else None
    rating = book.find("span", class_="a-icon-alt")
    rating = rating.text if rating else None
    price = book.select_one(".a-size-base.a-color-price span")
    price = price.text if price else None
    print(price)

Output:

₹299.00
₹316.00
₹139.00
₹323.00
₹284.05
₹761.00
₹187.00
₹299.00
₹222.30
₹299.00
₹299.00
₹550.00
₹139.00
₹305.99
₹299.00
₹256.00
₹297.00
₹1,012.00
₹309.00
₹109.00
₹399.00
₹292.74
₹289.75
₹410.00
₹279.30
₹125.00
₹313.95
₹449.00
₹357.00
₹198.00
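
If the repeated "x.text if x else None" lines get tedious, the guard can be folded into a small helper. This is only a sketch: text_or_none and parse_items are hypothetical names, and the sample markup with its item, rank, name and price classes is invented so the example runs without a network request. The real Amazon selectors would need to be substituted.

```python
from bs4 import BeautifulSoup

def text_or_none(parent, selector):
    """Return the stripped text of the first CSS match under parent, or None."""
    tag = parent.select_one(selector) if parent is not None else None
    return tag.get_text(strip=True) if tag is not None else None

def parse_items(html):
    """Collect one dict per item; missing fields become None instead of raising."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("div.item"):
        rows.append({
            "rank": text_or_none(item, ".rank"),
            "name": text_or_none(item, ".name"),
            "price": text_or_none(item, ".price"),
        })
    return rows

# Invented sample markup: the second item has no price.
sample = """
<div class="item"><span class="rank">#1</span><span class="name">Book A</span><span class="price">₹299.00</span></div>
<div class="item"><span class="rank">#2</span><span class="name">Book B</span></div>
"""
rows = parse_items(sample)
for row in rows:
    print(row)
```

Collecting the fields into dicts also keeps every value for a book together, so nothing is lost when only the price is printed.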