PFF data scraping not recognizing text

Time:09-17

I am trying to scrape PFF.com for football grades with Selenium. Specifically, I want one particular grade for every quarterback. The problem is that .text doesn't seem to capture anything, yet I am not getting a NoSuchElementException either.

Here's my code:

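    from time import sleep

    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    # my_email and my_password must be defined before this point
    service = Service(executable_path="C:\\chromedriver.exe")
    op = webdriver.ChromeOptions()
    driver = webdriver.Chrome(service=service, options=op)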
    
    driver.get("https://premium.pff.com/nfl/positions/2022/REG/passing?position=QB")
    sleep(2)

    sign_in = driver.find_element(By.XPATH, '/html/body/div/div/header/div[3]/button')
    sign_in.click()
    sleep(2)

    email = driver.find_element(By.XPATH, '/html/body/div/div/div/div/div/div/form/div[1]/input')
    email.send_keys(my_email)

    password = driver.find_element(By.XPATH, 
    '/html/body/div/div/div/div/div/div/form/div[2]/input')
    password.send_keys(my_password)
    sleep(2)

    sign_in_2 = driver.find_element(By.XPATH, 
    '/html/body/div/div/div/div/div/div/form/button')
    sign_in_2.click()
    sleep(2)

    all_off_grades = driver.find_elements(
        By.CSS_SELECTOR, '.kyber-table .kyber-grade-badge__info-text div')

    all_qb_names = driver.find_elements(By.CSS_SELECTOR, '.kyber-table .p-1 a')

    qb_grades = []
    qb_names = []


    for grade in all_off_grades:
        qb_grades.append(grade.text)

    for qb_name in all_qb_names:
        qb_names.append(qb_name.text)


    print(qb_grades)
    print(qb_names)        

The lists keep showing as empty.

Here are the elements I am trying to pull, for every QB. I already confirmed that the other QBs have the same class names for their grade and name.

<div >91.5</div> 

need to pull the 91.5

<a  href="/nfl/players/2022/REG/josh-allen/46601/passing">Josh Allen</a> 

need to pull Josh Allen

CodePudding user response:

@Jbuck3 I tried modifying the locators and it works for me. I am also including the output I am getting. Let me know if that is what you were expecting.

    all_off_grades = driver.find_elements(By.CSS_SELECTOR, '.kyber-table-body__scrolling-rows-container .kyber-grade-badge__info-text')

    all_qb_names = driver.find_elements(By.CSS_SELECTOR, "a[data-gtm-id = 'player_name']")

And the output I got is:

['91.5', '90.3', '74.6', '-', '-', '60.0', '84.3', '78.3', '78.1', '-', '-', '60.0', '82.8', '83.4', '-', '-', '-', '60.0']
['Josh Allen ', 'Geno Smith ', 'Kirk Cousins ', 'Marcus Mariota ', 'Jameis Winston ', 'Trey Lance ', 'Derek Carr ', 'Justin Fields ', 'Trevor Lawrence ', 'Russell Wilson ', 'Ryan Tannehill ', 'Tom Brady ', 'Tua Tagovailoa ', 'Mac Jones ', 'Davis Mills ', 'Matthew Stafford ', 'Baker Mayfield ', 'Lamar Jackson ', 'Joe Flacco ', 'Matt Ryan ', 'Jalen Hurts ', 'Daniel Jones ', 'Kyler Murray ', 'Justin Herbert ', 'Joe Burrow ', 'Aaron Rodgers ', 'Patrick Mahomes ', 'Mitchell Trubisky ', 'Dak Prescott ', 'Jacoby Brissett ', 'Carson Wentz ', 'Jared Goff ']
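
The output above has 18 grade values for 32 names. Since 18 is 3 x 6, one reading is that each table row carries six grade badges and only three rows' badges were in the DOM when the locator ran. That is an assumption drawn from the numbers, not something the markup shown here confirms. Under that assumption, a minimal sketch for grouping the flat list by row and pairing rows with names by position could look like this (the function names and the default of six badges per row are illustrative):

```python
from itertools import islice


def rows_of_grades(flat_grades, grades_per_row=6):
    """Split a flat list of grade strings into one list per table row.

    grades_per_row=6 is an assumption inferred from the sample output,
    not a documented property of the page.
    """
    if grades_per_row <= 0:
        raise ValueError("grades_per_row must be positive")
    it = iter(flat_grades)
    rows = []
    while True:
        chunk = list(islice(it, grades_per_row))
        if not chunk:
            break
        rows.append(chunk)
    return rows


def pair_names_with_grades(names, flat_grades, grades_per_row=6):
    """Pair each name with its row of grades, matching purely by position.

    zip() stops at the shorter input, so names that have no grade row
    are left out of the result.
    """
    rows = rows_of_grades(flat_grades, grades_per_row)
    return {name.strip(): row for name, row in zip(names, rows)}
```

Which of the six columns holds the grade you want is not visible from the output alone, so check the column order against the table header before indexing into each row.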

CodePudding user response:

Thank you for the response and help! I am still getting the same result, an empty string. It also doesn't seem like your output is pulling the grades for all QBs.

I must be doing something else wrong.

I've edited my post to include my full code, but at this point I am just gonna download the CSV and go the pandas route. Was hoping to use webscraping so I didn't have to download a new CSV after every week of football.

Thank you!
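
Two possible causes of empty strings are worth ruling out, offered here as assumptions rather than a diagnosis of this page. First, a fixed sleep(2) may run out before the table has rendered after sign-in. Second, Selenium's .text returns only rendered, visible text, so an element that is present but not displayed yields an empty string. A minimal sketch that polls until every matched element has text, and falls back to the textContent attribute, might look like this (element_text and wait_for_texts are hypothetical helper names; the timeout and poll values are arbitrary):

```python
import time


def element_text(element):
    """Return the element's visible text, or its raw textContent if that is empty."""
    text = (element.text or "").strip()
    if text:
        return text
    # .text is empty for elements that are not displayed; textContent is not
    raw = element.get_attribute("textContent")
    return (raw or "").strip()


def wait_for_texts(find_elements, timeout=15.0, poll=0.5):
    """Call find_elements() repeatedly until all returned elements have text.

    find_elements is any zero-argument callable returning a list of
    WebElement-like objects. On timeout, the last texts seen are returned
    (possibly empty or partial) instead of raising.
    """
    deadline = time.monotonic() + timeout
    while True:
        texts = [element_text(el) for el in find_elements()]
        if texts and all(texts):
            return texts
        if time.monotonic() >= deadline:
            return texts
        time.sleep(poll)
```

With a real driver this could be called as wait_for_texts(lambda: driver.find_elements(By.CSS_SELECTOR, "a[data-gtm-id = 'player_name']")). Selenium's own WebDriverWait covers similar ground; the hand-rolled loop is used here only to keep the sketch free of browser dependencies.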
