Why do I get AttributeError: 'NoneType' object has no attribute 'attrs'?


import requests
import bs4
import csv
from itertools import zip_longest

laptop = []
laptops_price = []
links = []

url = "https://www.jumia.com.eg/ar/catalog/?q=لابتوب"
page = requests.get("https://www.jumia.com.eg/ar/catalog/?q=لابتوب")
bs = bs4.BeautifulSoup(page.content, 'html.parser')
laptops = bs.find_all('h3')
laptops_prices = bs.find_all("div", {"class": "prc"})
for l in range(len(laptops)):
    laptop.append(laptops[l].text)
    links.append(laptops[l].find("a", {"class" : "core"}).attrs['href'])
    laptops_price.append(laptops_prices[l].text)


laptops_list = [laptop, laptops_price, links]
exported = zip_longest(*laptops_list)
with open(r"C:\Users\Administrator\Desktop\jumiawep.csv", "w", encoding="utf-8") as jumialaptops:
    write = csv.writer(jumialaptops)
    write.writerow(["Laptop", "Price", "Links"])
    write.writerows(exported)
Traceback (most recent call last):
  File "C:\Users\Administrator\PycharmProjects\pythonProject\main.py", line 17, in <module>
    links.append(laptops[l].find("a").attrs['href'])
AttributeError: 'NoneType' object has no attribute 'attrs'
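The error can be reproduced without the live site. In this hypothetical snippet of markup, the `<h3>` sits *inside* the `<a>`, so searching downward with `find()` yields `None`, and any attribute access on that `None` raises the `AttributeError`:

```python
import bs4

# Hypothetical markup mirroring the page structure: <h3> is a child of <a>
soup = bs4.BeautifulSoup("<a class='core' href='/x'><h3>Laptop</h3></a>", "html.parser")
h3 = soup.find("h3")

# find() searches descendants only, so there is no <a> below <h3>
result = h3.find("a", {"class": "core"})
print(result)  # None
# result.attrs['href']  # would raise AttributeError: 'NoneType' object has no attribute 'attrs'
```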

I tried to get a list of links problem when I was scraping but i get this error.

CodePudding user response:

In my opinion there are several issues here:

  • The website is protected by Cloudflare; I am not able to request it from my location.

Cloudflare is a network and security service that sits in front of websites and may block or challenge automated requests, such as those made with requests.

  • The <h3> does not have a child <a> for your find() to locate; instead, <h3> is a child of <a>. Because find() searches descendants only, it returns None, and accessing .attrs on None raises the error.

  • Avoid the separate lists and do the scraping in one pass.
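To illustrate the second point with the same hypothetical markup: since the `<a>` is an ancestor of the `<h3>`, BeautifulSoup's `find_parent()` searches *up* the tree and finds it where `find()` cannot:

```python
import bs4

# Hypothetical markup: the link wraps the heading
html = '<a class="core" href="/laptop-1"><h3>Laptop 1</h3></a>'
soup = bs4.BeautifulSoup(html, "html.parser")
h3 = soup.find("h3")

# find_parent() walks ancestors instead of descendants
link = h3.find_parent("a", {"class": "core"})
print(link["href"])  # /laptop-1
```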

Example

If you are not blocked by Cloudflare and the content is not rendered dynamically by JavaScript, this should give you the expected result.

import requests, csv
from bs4 import BeautifulSoup

url = "https://www.jumia.com.eg/ar/catalog/?q=لابتوب"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

with open(r"jumiawep.csv", "w", encoding="utf-8", newline="") as jumialaptops:
    write = csv.writer(jumialaptops)
    write.writerow(["Laptop", "Price", "Links"])

    for e in soup.select('article'):   # one <article> per product card
        write.writerow([
            e.h3.text,                 # laptop name
            e.select_one('.prc').text, # price
            e.a.get('href')            # product link
        ])
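If the scraped hrefs turn out to be relative paths (an assumption; the example path below is made up), the standard library's urljoin can turn them into absolute URLs before writing the CSV:

```python
from urllib.parse import urljoin

base = "https://www.jumia.com.eg"
# Hypothetical relative href as it might appear in the page
absolute = urljoin(base, "/ar/some-laptop.html")
print(absolute)  # https://www.jumia.com.eg/ar/some-laptop.html
```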