Python WebDriver: find_element() not finding the element

Time:06-15

I'm learning the very basics of web scraping by following Chapter 12 of Automate the Boring Stuff with Python, but I'm having an issue with the find_element() method. When I use it to look for an element with the class name 'card-img-top cover-thumb', it doesn't return any matches. However, the code does work for URLs other than the example in the book.

I have had to make quite a few changes to the code as written in order to get it to do anything. I've posted the full code on GitHub HERE, but to summarise:

  • The book says to use the 'find_element_by_*' methods, but these were producing deprecation warnings that directed me to use find_element() instead.

  • To use this other method, I import 'By'.

  • I also import 'Service' from 'selenium.webdriver.chrome.service' because ChromeDriver doesn't work otherwise.

  • I also define options with webdriver.ChromeOptions() that hide certain error messages about a faulty device, which apparently you're just supposed to ignore?

  • I put the code from the book into a function with 'url' and 'classname' arguments so I can test different URLs without having to edit the code repeatedly.

Here is the 'business-part' of the code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service  
from selenium.webdriver.common.by import By

s=Service(r'C:\Users\antse\AppData\Local\Chrome_WebDriver\chromedriver.exe')

op = webdriver.ChromeOptions()
op.add_experimental_option('excludeSwitches', ['enable-logging'])

def FNC_GET_CLASS_ELEMENT_FROM_PAGE(URL, CLASSNAME):       
    browser = webdriver.Chrome(service = s, options = op)
    browser.get(URL)
    try:  
        elem = browser.find_element(By.CLASS_NAME, CLASSNAME)
        print('Found <%s> element with that class name!' % (elem.tag_name))
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        print('Was not able to find an element with that name.')

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top cover-thumb')

Expected output: Found <img> element with that class name!

Since the code does work when I look at a site like Wikipedia, I wonder if there have been changes to the HTML of the page that prevent the scrape from working properly?

Link to the book chapter HERE.

I appreciate any advice you can give me!

CodePudding user response:

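You can't pass multiple classes to find_element() with By.CLASS_NAME. It accepts exactly one class name, and 'card-img-top cover-thumb' is two class names separated by a space. So replace this: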

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top cover-thumb')

with this:

FNC_GET_CLASS_ELEMENT_FROM_PAGE('https://inventwithpython.com', 'card-img-top')

If you really want to use both classes, then take a look at this answer which explains things in detail.
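One common way to match on both classes is a CSS selector that chains them, such as 'img.card-img-top.cover-thumb'. The sketch below is only an illustration: classes_to_css_selector and find_by_all_classes are hypothetical helper names, not part of Selenium, and it assumes the class names contain no characters that need CSS escaping.

```python
def classes_to_css_selector(class_attr, tag=""):
    """Turn a space-separated class attribute value into a CSS selector.

    Hypothetical helper: 'card-img-top cover-thumb' becomes
    '.card-img-top.cover-thumb', which only matches elements that
    carry every listed class.
    """
    class_names = class_attr.split()
    if not class_names:
        raise ValueError("expected at least one class name")
    return tag + "".join("." + name for name in class_names)


def find_by_all_classes(browser, class_attr, tag=""):
    """Find the first element that has all the given classes.

    'browser' is assumed to be an already-created Selenium WebDriver.
    """
    # Imported here so the selector helper works without Selenium installed.
    from selenium.webdriver.common.by import By

    selector = classes_to_css_selector(class_attr, tag)
    return browser.find_element(By.CSS_SELECTOR, selector)


selector = classes_to_css_selector("card-img-top cover-thumb", tag="img")
print(selector)  # img.card-img-top.cover-thumb
```

Inside the function from the question, the lookup would then become something like elem = find_by_all_classes(browser, CLASSNAME), with the rest of the code unchanged.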
