Beautiful Soup throws a None type attribute error when scraping a dynamic website

Time:09-23

I have a piece of code that should print the text of a certain element. I've found the class names on that element: 'main-line-status-txt' and 'status-up'. The values are generated client-side with JavaScript, so I made sure to set a generous timeout of 20 sec just to get it to work. This is what the relevant code block looks like:

from bs4 import BeautifulSoup
from webdriver_manager.microsoft import EdgeChromiumDriverManager
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec


def server_status():
    # Preload web page with selenium
    browser = webdriver.Edge(EdgeChromiumDriverManager().install())
    browser.maximize_window()
    wait = WebDriverWait(browser, 200)
    url = 'https://www.dualuniverse.game/server-status'
    browser.get(url)
    wait.until(ec.presence_of_element_located((By.CLASS_NAME, 'main-line-status-desc')))

    soup = BeautifulSoup(browser.page_source, 'html.parser')
    print(soup.find('main-line-status-txt').find('status-up'))

And here's the error it gives:

print(soup.find('main-line-status-txt').find('status-up'))

AttributeError: 'NoneType' object has no attribute 'find'

I'd love some pointers here. This is the first time I'm using Selenium and Beautiful Soup. Many thanks in advance.

Cheers.

CodePudding user response:

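I assume you must be calling server_status somewhere; it's not part of your example. The immediate cause of the error is that soup.find('main-line-status-txt') searches for a tag named main-line-status-txt, not for an element with that class, so it returns None and the chained .find('status-up') raises the AttributeError. Search by class instead, for example soup.find(class_='main-line-status-txt'). Keep in mind that JavaScript doesn't change the source the server sent; it changes the object model in memory, so these elements only exist after the script has run. You can also skip Beautiful Soup and query the object model directly with Selenium: browser.find_elements(By.CLASS_NAME, 'main-line-status-txt') in Selenium 4, or find_elements_by_class_name in older versions.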
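
Wherever the HTML comes from, it may help to see how Beautiful Soup interprets the argument to find. The sketch below uses a made-up snippet of markup (SAMPLE_HTML and the helper status_text are assumptions for illustration; the real page's structure and text may differ) to show that a bare string is matched against tag names, while class_ or a CSS selector matches against class names. It also guards against None before reading the text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the rendered page; the real site's
# structure and text may differ.
SAMPLE_HTML = """
<div class="main-line">
  <span class="main-line-status-desc">Server status</span>
  <span class="main-line-status-txt status-up">Online</span>
</div>
"""


def status_text(html):
    """Return the text of the status element, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches against the class attribute; a bare string would be
    # treated as a tag name instead.
    element = soup.find(class_="main-line-status-txt")
    if element is None:
        return None
    return element.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Looks for a <main-line-status-txt> tag, which does not exist.
by_tag_name = soup.find("main-line-status-txt")

# CSS selector requiring both classes on the same element.
by_both_classes = soup.select_one(".main-line-status-txt.status-up")

print(by_tag_name)
print(status_text(SAMPLE_HTML))
print(by_both_classes.get_text(strip=True))
```

In the question's function, the same helper could be called as status_text(browser.page_source) once the wait has finished. Returning None instead of chaining calls keeps a missing element from raising an AttributeError.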
