Check if text exists on a webpage using Python Selenium WebDriver

Time:12-27

I'm trying to use Selenium to see if a certain string exists on a webpage and return true/false accordingly. I can't seem to find anyone else who's used this to return a boolean.

I've included my code below. I've also tried a bunch of other variations, including WebDriverWait, searching the body text, finding an element by XPath, unittest with assert, and more. Any suggestions would be greatly appreciated!

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import pandas as pd
import time



# SET EVERYTHING UP FOR THE TASK
# open the spreadsheet and read in data
df = pd.read_csv('C:\\Users\\jdoe\\Downloads\\firms.csv')
# create new column for word search results
df['WORD PRESENCE?'] = ""
# make sure pandas prints entire df
pd.set_option("display.max_rows", None, "display.max_columns", None)


# designate what you'd like to search for -- FILL IN THE WORD
wordsearch = "internet"
name_to_search = ""



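# chrome driver path & location
# (assumes Selenium 4, where the driver path is passed through a Service object)
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service(r"C:\Users\jdoe\chromedriver.exe"))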

def setup():
    # reset the driver and open google
    driver.get("https://google.com")

def search(input_text):
    # access the search bar and send a command
    search = driver.find_element("name", "q")
    search.send_keys(input_text)
    search.send_keys(Keys.RETURN)

def opensite():
    # open the top search result
    website = driver.find_element(By.CLASS_NAME, "yuRUbf")
    website.find_element(By.XPATH, "./a").click()

def findtext(wordsearch):
    # find out if specific text is on the webpage and return the result
    #return wordsearch in driver.page_source
    if wordsearch in driver.page_source:
        return True
    else:
        return False

def reset():
    # close the webdriver
    driver.quit()
    
    
    
# go through each row in the spreadsheet and fill everything in
for i, rows in df.iterrows():
    setup()
    
    # search for website and open it
    name_to_search = df.at[i, "NAME"]
    search(name_to_search)
    opensite()
    
    # fill in the website column of the spreadsheet
    current_website = driver.current_url
    df.at[i, "WEBSITE"] = current_website
    
    # find out if the designated word is on the webpage
    time.sleep(2)
    is_word_there = findtext(wordsearch)
    df.at[i, "WORD PRESENCE?"] = is_word_there
    
print(df)
reset()

CodePudding user response:

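I figured it out. The problem was case sensitivity: the "in" check against driver.page_source is case-sensitive in Python, so a search for "internet" misses "Internet" and "INTERNET". I wrote another function that also matches the all-uppercase, all-lowercase, and first-letter-uppercase forms of the word.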
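
The function itself was not posted. As a rough sketch of the same idea, one option is to compare case-folded copies of the page source and the search word, which covers those three capitalizations and any mixed-case form as well. The names contains_text and sample_html are hypothetical and not taken from the original script; sample_html only stands in for driver.page_source so the snippet runs without a browser.

```python
def contains_text(page_source: str, word: str) -> bool:
    """Return True if `word` occurs anywhere in `page_source`, ignoring case."""
    # casefold() is a stricter version of lower() meant for caseless comparison
    return word.casefold() in page_source.casefold()


# Hypothetical stand-in for driver.page_source
sample_html = "<html><body><h1>INTERNET services</h1></body></html>"

print(contains_text(sample_html, "internet"))  # True
print(contains_text(sample_html, "Internet"))  # True
print(contains_text(sample_html, "intranet"))  # False
```

In the loop from the question, findtext(wordsearch) could then be swapped for contains_text(driver.page_source, wordsearch). Note that this still searches the raw HTML, so a word that appears only in a tag attribute or a script also counts as a match.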
