I want to make site map using BeautifulSoup. But facing this problem "TypeError: 'NoneType

Time:02-27

I'm trying to build a site map by scraping a site with BeautifulSoup, but I've run into a problem. My code raises this error:

"TypeError: 'NoneType' object is not subscriptable"

Here is my code:

import requests
from bs4 import BeautifulSoup as bs

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'}

url = "https://www.celebheights.com/"

res= requests.get(url, headers=headers)

html = bs(res.text, 'html.parser')

lilink = html.findAll('li')

for li in lilink:

    alink = li.find('a')['href']

    print(alink)

How can I solve this problem?

CodePudding user response:

You can use print() to inspect the variables on the line that raises the error.

This page has some <li> elements without an <a> inside. For those, li.find('a') returns None, and None['href'] raises the TypeError.
Check what find() returned before reading href, because sometimes it is None.

for li in lilink:
    alink = li.find('a')
    
    if alink:
        url = alink['href']
        print(url)
    else:
        print('<li> without <a>:', li)

Result:

https://www.celebheights.com/
https://www.celebheights.com/comments.html
https://www.celebheights.com/s/latest_1.html
https://www.celebheights.com/s/compare.php
https://www.celebheights.com/s/top50.html
https://www.youtube.com/user/robpaul
<li> without <a>: <li id="ilsook"></li>
https://www.celebheights.com/s/latest_1.html
https://www.celebheights.com/s/Sean-Kanan-52921.html
https://www.celebheights.com/s/Michael-Parks-52920.html
https://www.celebheights.com/s/Harlan-Drum-52919.html
https://www.celebheights.com/s/Patricia-Medina-52918.html
https://www.celebheights.com/s/Nan-Leslie-52917.html
https://www.celebheights.com/s/Don-Cornelius-52916.html
https://www.celebheights.com/s/Maria-Sten-52915.html
https://www.celebheights.com/s/Bruce-McGill-52914.html
https://www.celebheights.com/comments.html
https://www.celebheights.com/s/compare.php
https://www.celebheights.com/s/top50.html
https://www.celebheights.com/s/Justin-Bieber-47348.html
https://www.celebheights.com/s/Tom-Cruise-3.html
https://www.celebheights.com/s/Brad-Pitt-371.html
https://www.celebheights.com/s/Arnold-Schwarzenegger-177.html
https://www.celebheights.com/s/Sylvester-Stallone-347.html
https://www.celebheights.com/sneakers/
https://www.celebheights.com/a/23.html
https://www.celebheights.com/a/
https://www.celebheights.com/s/tagsA.html
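
The output above also contains repeated URLs, which a site map probably does not want. Here is a minimal sketch of one way to skip <li> elements without a usable link and drop duplicates at the same time. It runs on a made-up HTML snippet instead of the live page, and the names SAMPLE_HTML and collect_links are hypothetical, not part of the original code.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Made-up HTML standing in for the downloaded page (an assumption,
# not the real markup of the site).
SAMPLE_HTML = """
<ul>
  <li><a href="/comments.html">Comments</a></li>
  <li id="ilsook"></li>
  <li><a name="top">No href here</a></li>
  <li><a href="https://www.celebheights.com/s/top50.html">Top 50</a></li>
  <li><a href="/comments.html">Comments again</a></li>
</ul>
"""

def collect_links(html, base_url):
    """Return unique absolute URLs of the first <a href> in each <li>, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for li in soup.find_all("li"):
        # href=True only matches <a> tags that have an href attribute,
        # so find() returns None for an empty <li> or an <a> without href.
        a = li.find("a", href=True)
        if a is None:
            continue
        # Resolve relative links against the page URL.
        absolute = urljoin(base_url, a["href"])
        if absolute not in links:
            links.append(absolute)
    return links

links = collect_links(SAMPLE_HTML, "https://www.celebheights.com/")
for link in links:
    print(link)
```

With the real page you would pass res.text and url instead of the sample values. The list-based duplicate check is fine for one page; for a crawl over many pages a set would likely be a better fit.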

CodePudding user response:

Instead of res.text, try res.content.
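
A small sketch to check this suggestion offline, assuming a made-up snippet in place of the real response. BeautifulSoup accepts both bytes (like res.content) and str (like res.text); the difference is mainly who decodes the bytes. In this sample an empty <li> gives None either way, so the None check still seems necessary. The names raw and hrefs_or_none are hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up bytes standing in for res.content (an assumption, not the real page).
raw = b'<ul><li><a href="/a/">A</a></li><li id="ilsook"></li></ul>'

def hrefs_or_none(markup):
    """Return the href of the first <a> in each <li>, or None when there is no <a>."""
    soup = BeautifulSoup(markup, "html.parser")
    result = []
    for li in soup.find_all("li"):
        a = li.find("a")
        result.append(a["href"] if a is not None else None)
    return result

from_bytes = hrefs_or_none(raw)                 # like passing res.content
from_text = hrefs_or_none(raw.decode("utf-8"))  # like passing res.text
print(from_bytes)
print(from_text)
```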
