Scraping with Selenium and XPaths

Time:10-20

I'm using Selenium to scrape football odds. The teams are in HTML like this:

<div class="participant">
Kansas City Chiefs
<span class='participant country'>
</span>
</div>

The moneylines look like this:

...
<div class="option option-value ng-star-inserted">
<ms-font-resizer maxchars="6">+145</ms-font-resizer>
</div>

I'm trying to extract the teams (one would be Kansas City Chiefs) and the odds (+145 here) into lists.

from selenium import webdriver
import pandas as pd

url = "https://sports.va.betmgm.com/en/sports/football-11/betting/usa-9"
xpaths = ["//div[@class='participant']",
          "//div[@class='option option-value ng-star-inserted']"]

with webdriver.Firefox() as driver:
        driver.get(url)
        
        teams_elems = driver.find_elements_by_xpath(xpaths[0])
        mlines_elems = driver.find_elements_by_xpath(xpaths[1])
        
        mlines, teams = [], []
        for m, t in zip(mlines_elems, teams_elems):
            mlines.append(m.text)
            teams.append(t.text)
        
        driver.close()

This runs without errors, but both lists come back empty. I think I'm using XPaths incorrectly. Similar code worked on DraftKings, but it doesn't work with this HTML. Thanks for any help.

CodePudding user response:

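The moneyline div doesn't have one class named 'option option-value ng-star-inserted'; it has three classes: 'option', 'option-value', and 'ng-star-inserted'. A test like @class='...' compares the whole attribute string, so it only matches when the attribute is exactly that text.

To locate a tag by several classes, you can use contains(), which checks for a substring of the class attribute. This version still needs the three classes to appear together in this order: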

"//div[contains(@class, 'option option-value ng-star-inserted')]"

Or, to match those classes in any order, try:

"//div[contains(@class, 'option')][contains(@class, 'option-value')][contains(@class, 'ng-star-inserted')]"

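One caveat with the expressions above: contains(@class, 'option') is a substring test, so it would also match a class such as 'option-indicator'. A common XPath 1.0 idiom pads the normalized attribute with spaces and matches each class as a whole token. The sketch below only builds the XPath strings; the helper names has_class and xpath_with_classes are made up for illustration and are not part of Selenium.

```python
def has_class(token: str) -> str:
    """Return an XPath predicate that matches one whole class token."""
    # Padding with spaces keeps 'option' from matching 'option-value'.
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {token} ')"


def xpath_with_classes(tag: str, *tokens: str) -> str:
    """Build //tag[...][...] requiring every class token, in any order."""
    predicates = "".join(f"[{has_class(t)}]" for t in tokens)
    return f"//{tag}{predicates}"


# XPath for the moneyline div from the question
mline_xpath = xpath_with_classes("div", "option", "option-value", "ng-star-inserted")
print(mline_xpath)
```

The resulting string can be passed to the driver's XPath lookup in place of the expressions above.
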
CodePudding user response:

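Try the XPaths below and confirm:

import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://sports.va.betmgm.com/en/sports/football-11/betting/usa-9")

i = 0
try:
    while True:
        # Re-query on each pass, since the element lists may grow as the page scrolls.
        # find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which Selenium 4.3 removed.
        teams = driver.find_elements(By.XPATH, "//ms-six-pack-event//div[@class='participant']")
        mlines = driver.find_elements(By.XPATH, "//ms-six-pack-event//div[@class='option-indicator']//following-sibling::div/ms-font-resizer")
        # Scroll the current team into view before reading its text.
        driver.execute_script("arguments[0].scrollIntoView(true);", teams[i])
        print(teams[i].text, mlines[i].text)
        i += 1
        time.sleep(.5)
except IndexError:
    # teams[i] or mlines[i] ran past the end of the list: nothing left to print.
    pass
finally:
    driver.quit()

Output: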
Denver Broncos 1.91
Cleveland Browns 1.91
Washington Football Team 1.91
...
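
One possibility, not confirmed in this thread, is that the original lists were empty because the page fills in its content after driver.get returns; the 'ng-star-inserted' class and the sleep in the loop above hint at that. A minimal polling sketch under that assumption is shown below. The helper name wait_for_elements is hypothetical, and it relies only on the driver exposing find_elements(by, value), where "xpath" is the string value of Selenium's By.XPATH. Selenium also ships WebDriverWait for the same purpose.

```python
import time


def wait_for_elements(driver, xpath, timeout=10.0, poll=0.5):
    """Poll until the XPath matches at least one element or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        # "xpath" is the locator strategy string behind By.XPATH
        elements = driver.find_elements("xpath", xpath)
        if elements or time.monotonic() >= deadline:
            # Either something matched, or we gave up (empty list)
            return elements
        time.sleep(poll)
```

For example, wait_for_elements(driver, "//ms-six-pack-event//div[@class='participant']") would return the team elements once they appear, or an empty list after about ten seconds.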