I'm trying to scrape some sports-results webpages using Python and Selenium, and I'm currently stuck on this one.
I would like to get a list of lists of lists. Let me explain with an example:
[
[ [Bet365,2.5,-,-] , ... , [Unibet.it,2.5,2.95,1.38] , [WilliamHill.it,2.5,-,-] ] ,
[ [Bet365,17.5,-,-] , ... , [Unibet.it,17.5,1.37,3.05] , [WilliamHill.it,17.5,-,-] ] ,
...
]
There may be a better way to collect this information, but to build these lists I'm currently working like this:
tables_bodies = driver.find_elements(By.CSS_SELECTOR, "div[class='ui-table__body']")
# for every body, I find every row
for body in tables_bodies:
    body_raws.append(body.find_elements(By.CSS_SELECTOR, "div[class='ui-table__row']"))
for r in righe_del_body:
    raws.append(r.text)  # --> this gives me an error since lists do not have a text attribute!
This is far from being right.
Any help would be appreciated.
CodePudding user response:
I think this does what you want:
# Needed libs
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get('https://www.flashscore.it/partita/fXh6kI8A/#/comparazione-quote/over-under/finale')
# We get all the bookmakers
bookmakers = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='ui-table oddsCell__odds']")))
# For every bookmaker...
for i in range(1, len(bookmakers) + 1):
    # We get every row (bet365, williamhill etc)
    rows = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, f"((//div[@class='ui-table oddsCell__odds'])[{i}]//div[@class='ui-table__row'])")))
    print(f"Results for bookmaker number: {i}")
    # For every row we get the title, set, over and under
    for z in range(1, len(rows) + 1):
        title = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.XPATH, f"((((//div[@class='ui-table oddsCell__odds'])[{i}]//div[@class='ui-table__row']))//img)[{z}]"))).get_attribute('title')
        set = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.XPATH, f"((((//div[@class='ui-table oddsCell__odds'])[{i}]//div[@class='ui-table__row']))[{z}]//span)[1]"))).text
        over = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.XPATH, f"((((//div[@class='ui-table oddsCell__odds'])[{i}]//div[@class='ui-table__row']))[{z}]//span)[2]"))).text
        under = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.XPATH, f"((((//div[@class='ui-table oddsCell__odds'])[{i}]//div[@class='ui-table__row']))[{z}]//span)[3]"))).text
        # I am printing the results, but you can save them as you want
        print(f"{title}: [{set}, {over}, {under}]")
Your idea was basically correct; it only needed a few adjustments. I hope the comments in the code help you understand it.
Output:
Results for bookmaker number: 1
bet365.it: [2.5, -, -]
Eurobet.it: [2.5, -, -]
Planetwin365: [2.5, -, -]
Unibet.it: [2.5, 2.95, 1.38]
WilliamHill.it: [2.5, -, -]
Results for bookmaker number: 2
bet365.it: [17.5, -, -]
Eurobet.it: [17.5, -, -]
Planetwin365: [17.5, -, -]
Unibet.it: [17.5, 1.37, 3.05]
WilliamHill.it: [17.5, -, -]
And so on
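If you want the list of lists of lists from the question instead of printed lines, something like the sketch below might work. The error in the question happens because find_elements returns a list, so every entry in body_raws is itself a list of rows and needs its own inner loop before .text can be read. The function name collect_odds and the constant XPATH are made up for this illustration, and the sketch assumes the page has already finished loading (for example after the WebDriverWait call above) and that each row holds one img and at least three span elements, as in the output shown.

```python
# Locator strategy string; it has the same value as selenium's By.XPATH,
# so By.XPATH can be used instead if selenium is imported.
XPATH = "xpath"


def collect_odds(driver):
    """Return one inner list per odds table, each holding
    [bookmaker, line, over, under] entries (hypothetical helper)."""
    result = []
    tables = driver.find_elements(XPATH, "//div[@class='ui-table oddsCell__odds']")
    for table in tables:
        table_rows = []
        # The leading '.' keeps each search scoped to the current element
        for row in table.find_elements(XPATH, ".//div[@class='ui-table__row']"):
            title = row.find_element(XPATH, ".//img").get_attribute("title")
            # Assumption: the first three spans are the line, over and under values
            texts = [span.text for span in row.find_elements(XPATH, ".//span")[:3]]
            table_rows.append([title, *texts])
        result.append(table_rows)
    return result
```

Calling odds = collect_odds(driver) after the page has loaded should then give the nested structure, with odds[0] holding the rows of the first table. Searching from each element rather than rebuilding one long indexed XPath per cell also keeps the number of lookups small.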