In another thread I was advised to post this as a separate question. I am trying to scrape a website whose content only loads as you scroll down the page, and I want to copy the data into my DataFrame while scrolling.
So far, the code below copies only the first elements on the page, because those are the only visible ones, but I need the whole list down to the end of the page:
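import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()  # assumption: the driver setup was not shown in the post
driver.maximize_window()
wait = WebDriverWait(driver, 30)
driver.get('https://www.livescore.com/en/')
# accept the cookie banner before reading the page
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
# only the rows currently rendered in the page are returned here
games = driver.find_elements(By.CSS_SELECTOR, 'div[class = "MatchRow_matchRowWrapper__1BtJ3"]')
data1 = []
for game in games:
    data1.append({
        'Home': game.find_element(By.XPATH, './/div[contains(@class,"MatchRow_home")]').text,
        'Away': game.find_element(By.XPATH, './/div[contains(@class,"MatchRow_away")]').text,
        'Time': game.find_element(By.XPATH, './/span[contains(@id,"match-row")]').text
    })
df = pd.DataFrame(data1)  # create dataframe
print(df)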
Any tips? Thanks.
CodePudding user response:
My tip is to get the data from the API. It is far more efficient than using Selenium here:
import requests
import pandas as pd
import datetime

url = "https://prod-public-api.livescore.com/v1/api/react/date/soccer/20220309/0.00?MD=1"
jsonData = requests.get(url).json()

rows = []
for stage in jsonData['Stages']:
    events = stage['Events']
    for event in events:
        gameDateTime = event['Esd']
        date_time_obj = datetime.datetime.strptime(str(gameDateTime), '%Y%m%d%H%M%S')
        gameTime = date_time_obj.strftime("%H:%M")
        homeTeam = event['T1'][0]['Nm']
        awayTeam = event['T2'][0]['Nm']
        row = {
            'Home': homeTeam,
            'Away': awayTeam,
            'Time': gameTime}
        rows.append(row)

df = pd.DataFrame(rows)
Output:
print(df)
                Home                 Away   Time
0    Manchester City          Sporting CP  20:00
1        Real Madrid  Paris Saint-Germain  20:00
2           FC Porto                 Lyon  17:45
3         Real Betis  Eintracht Frankfurt  17:45
4          Dundee FC           St. Mirren  19:45
..               ...                  ...    ...
281       Modafen FK           Cankaya FK  11:00
282          UPDF FC         Arua Hill SC  11:00
283    Wakiso Giants         Mbarara City  13:00
284      Kokand 1912              Olympic  13:30
285     Nasaf Qarshi    Metallurg Bekobod  13:30

[286 rows x 3 columns]
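
The date in the URL above is hard-coded (20220309). Here is a minimal sketch of the same idea for any date. It assumes the endpoint keeps the same path layout and that the JSON keeps the field names used above (Stages, Events, Esd, T1, T2, Nm). The helper names build_url and parse_events are made up for illustration, and the sample payload is invented only to show the expected shape:

```python
import datetime

# Assumption: same endpoint layout as the URL in the answer; only the date changes.
BASE_URL = "https://prod-public-api.livescore.com/v1/api/react/date/soccer/{date}/0.00?MD=1"


def build_url(day):
    """Return the API URL for a datetime.date (hypothetical helper)."""
    return BASE_URL.format(date=day.strftime("%Y%m%d"))


def parse_events(payload):
    """Flatten the Stages -> Events structure into a list of row dicts."""
    rows = []
    for stage in payload.get("Stages", []):
        for event in stage.get("Events", []):
            # Esd looks like 20220309200000 (YYYYmmddHHMMSS)
            kickoff = datetime.datetime.strptime(str(event["Esd"]), "%Y%m%d%H%M%S")
            rows.append({
                "Home": event["T1"][0]["Nm"],
                "Away": event["T2"][0]["Nm"],
                "Time": kickoff.strftime("%H:%M"),
            })
    return rows


# Invented sample payload, shaped like the fields read above.
sample = {
    "Stages": [
        {"Events": [
            {"Esd": 20220309200000,
             "T1": [{"Nm": "Manchester City"}],
             "T2": [{"Nm": "Sporting CP"}]},
        ]},
    ]
}

rows = parse_events(sample)
print(rows)
```

With network access, something like pd.DataFrame(parse_events(requests.get(build_url(datetime.date.today())).json())) should give the same kind of DataFrame as above, provided the API still responds in that format.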