I have a piece of code that should print the text of a particular element. The element carries the classes 'main-line-status-txt' and 'status-up'. The values are generated client-side with JavaScript, so I made sure to set a generous timeout of 20 sec just to get it to work. This is what the relevant code block looks like:
from bs4 import BeautifulSoup
from webdriver_manager.microsoft import EdgeChromiumDriverManager
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec

def server_status():
    # Preload web page with selenium
    browser = webdriver.Edge(EdgeChromiumDriverManager().install())
    browser.maximize_window()
    wait = WebDriverWait(browser, 200)
    url = 'https://www.dualuniverse.game/server-status'
    browser.get(url)
    wait.until(ec.presence_of_element_located((By.CLASS_NAME, 'main-line-status-desc')))
    soup = BeautifulSoup(browser.page_source, 'html.parser')
    print(soup.find('main-line-status-txt').find('status-up'))
And here's the error it gives:
print(soup.find('main-line-status-txt').find('status-up'))
AttributeError: 'NoneType' object has no attribute 'find'
I'd love some pointers here. This is the first time I'm using Selenium and Beautiful Soup. Many thanks in advance.
Cheers.
CodePudding user response:
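I assume you must be calling server_status somewhere; it's not part of your example. The immediate cause of the AttributeError is that soup.find('main-line-status-txt') searches for a tag named main-line-status-txt, not for an element with that class, so it returns None and the chained .find fails. Beyond that, the page_source attribute is not guaranteed to reflect what JavaScript did after the page loaded. JavaScript doesn't change the source the server sent, it changes the object model in memory, so relying on page_source can leave you parsing markup in which the elements do not exist yet. The more reliable approach is to query the object model directly through Selenium: browser.find_elements(By.CLASS_NAME, 'main-line-status-txt') in Selenium 4, or find_elements_by_class_name in older releases.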