'NoneType' object has no attribute 'find_all' error


Please help me with my code.

relation_tables = char_soup.find('ul', class_='subNav').find_all('li')
like_page_url = url + relation_tables[2].find('a').get('href')  # Get like page's url
dislike_page_url = url + relation_tables[3].find('a').get('href')  # Get dislike page's url
like_r = requests.get(like_page_url)  # Get source of page with users who liked/disliked
dislike_r = requests.get(dislike_page_url)
like_soup = BeautifulSoup(like_r.text, 'html.parser')
dislike_soup = BeautifulSoup(dislike_r.text, 'html.parser')
like_pages = int(like_soup.find('ul', class_='nav').find_all('li')[13].text)
dislike_pages = int(dislike_soup.find('ul', class_='nav').find_all('li')[13].text)
n = like_soup.find('table', class_='pure-table striped').find_all('tr')  # WORKS
for i in range(0, like_pages):
    like_users_trs = like_soup.find('table', class_='pure-table striped').find_all('tr') # DON'T
    curr_character_like_names.extend([f'{url}{tr.find("a").text}' for tr in like_users_trs])  # Get all user names and add them to the list
    like_page_url = url + like_soup.find('li', class_='next').find('a').get('href')  # Then find the 'next' button and get the next page's url
    like_r = requests.get(like_page_url)  # Then find 'next' button and get next page's url
    like_soup = BeautifulSoup(like_r.text, 'html.parser')  # Get source of the next page

This code should collect the list of user names from the page of users who liked a character and from the page of users who disliked it (two different pages). The problem is that one of two lines that do the same thing doesn't work: n = like_soup.find('table', class_='pure-table striped').find_all('tr') (that line is just for testing) is outside the loop and works fine, but the equivalent line inside the loop (like_users_trs = like_soup.find('table', class_='pure-table striped').find_all('tr')) throws this error:

Traceback (most recent call last):
  File "/home/sekki/Documents/Pycharm/anime_planetDB/main.py", line 131, in <module>
    like_users_trs = like_soup.find('table', class_='pure-table striped').find_all('tr') # DON'T
AttributeError: 'NoneType' object has no attribute 'find_all'
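
The error means that, inside the loop, find('table', class_='pure-table striped') returned None, most likely because one of the follow-up requests did not return the page you expected (for example, the last page has no 'next' link, or the site rejected the request). Below is a minimal defensive sketch of the loop body, reusing the same requests and BeautifulSoup imports and the url, like_soup, and curr_character_like_names variables from the snippet above, that checks each lookup for None before chaining find_all:

for i in range(0, like_pages):
    table = like_soup.find('table', class_='pure-table striped')
    if table is None:
        # The results table is missing on this page; log the URL and stop.
        print('No results table found on', like_page_url)
        break
    like_users_trs = table.find_all('tr')
    curr_character_like_names.extend(f'{url}{tr.find("a").text}' for tr in like_users_trs)

    next_li = like_soup.find('li', class_='next')
    if next_li is None or next_li.find('a') is None:
        # Last page reached: there is no 'next' link to follow.
        break
    like_page_url = url + next_li.find('a').get('href')
    like_r = requests.get(like_page_url)
    like_soup = BeautifulSoup(like_r.text, 'html.parser')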

Additional info:

CodePudding user response:

Looks like you are overcomplicating this a bit. Looking at the patterns, the names start to repeat once you get past the last page, so just run a while True loop until that happens.

Secondly, let pandas parse that table for you:

import pandas as pd
import requests


def get_date(url):
    df = pd.DataFrame(columns=[0])
    page = 1
    continueLoop = True
    while continueLoop == True:
        url_page = f'{url}?page={page}'
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
        response = requests.get(url_page, headers=headers).text
        temp_df = pd.read_html(response)[0]
        
        if list(temp_df[0])[0] not in list(df[0]):
            print(f'Collected Page: {page}')
            df = df.append(temp_df)
            page += 1
        else:
            continueLoop = False
            
    return df


dfLoves = get_date('https://www.anime-planet.com/characters/armin-arlelt/loves')
dfHates = get_date('https://www.anime-planet.com/characters/armin-arlelt/hates')
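
Note that DataFrame.append was removed in pandas 2.0, so on a current pandas install the accumulation line inside the if block would need pd.concat instead. A minimal sketch of that substitution:

# On pandas >= 2.0, replace df = df.append(temp_df) with:
df = pd.concat([df, temp_df])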

Output:

print(dfLoves)
                  0
0         atsumuboo
1         Ken0brien
2           Kabooom
3          xsleepyn
4    camoteconpapas
..              ...
21        SonSoneca
22  SayaSpringfield
23          Kurasan
24     HikaruTenshi
0     silvertail123

[15026 rows x 1 columns]

print(dfHates)
                 0
0           selvnq
1   LiveLaughLuffy
2      SixxTheGoat
3        IceWolfTO
4         Sam234io
..             ...
11     phoenix5793
12          Tyrano
13      SimplyTosh
14      KrystaChan
15     SHADOWORZA0

[2591 rows x 1 columns]