How to scrape live matches?

Time:03-10

I need to scrape only the live matches from https://www.livescore.com/en/football/live/. The code below currently returns the full list of football matches (not started, live and finished) with the home team name, away team name and start time. What I need is a dataframe containing only the live matches, with the home team name, away team name and current minute of play.

Thanks.

import requests
import pandas as pd
import datetime

url = "https://prod-public-api.livescore.com/v1/api/react/date/soccer/20220309/0.00?MD=1"
jsonData = requests.get(url).json()

rows = []
for stage in jsonData['Stages']:
    events = stage['Events']
    for event in events:
        gameDateTime = event['Esd']
        date_time_obj = datetime.datetime.strptime(str(gameDateTime), '%Y%m%d%H%M%S')
        gameTime = date_time_obj.strftime("%H:%M")
        homeTeam = event['T1'][0]['Nm']
        awayTeam = event['T2'][0]['Nm']
        
        row = {
            'Home':homeTeam,
            'Away':awayTeam,
            'Time':gameTime}
        rows.append(row)
        
df = pd.DataFrame(rows)

Answer:

It's just a matter of pulling the data from the correct endpoint. The live games come from https://prod-public-api.livescore.com/v1/api/react/live/soccer/0.00?MD=1
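To make the difference between the two endpoints explicit, here is a small sketch that builds both URLs from their shared parts. The names `BASE`, `SUFFIX`, `LIVE_URL` and `date_url` are hypothetical helpers, not part of any library, and the per-date pattern is inferred only from the single URL in the question, so it may not hold for every date.

```python
import datetime

# Base path copied from the two URLs quoted in this thread.
BASE = "https://prod-public-api.livescore.com/v1/api/react"

# Suffix copied verbatim from both URLs. What "0.00" and "MD=1" mean is not
# stated in the thread, so they are left untouched here.
SUFFIX = "0.00?MD=1"

# Live matches: the path uses "live/soccer" and carries no date.
LIVE_URL = f"{BASE}/live/soccer/{SUFFIX}"


def date_url(day):
    """Build the per-date URL for a datetime.date (assumed pattern)."""
    return f"{BASE}/date/soccer/{day:%Y%m%d}/{SUFFIX}"


print(LIVE_URL)
print(date_url(datetime.date(2022, 3, 9)))
```

Only the path segment changes between the two requests; the JSON is read the same way in both cases.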

I added the score, but you can easily remove that if needed.

Code:

import requests
import pandas as pd

# Live-only endpoint (the question used the per-date endpoint instead)
url = "https://prod-public-api.livescore.com/v1/api/react/live/soccer/0.00?MD=1"
jsonData = requests.get(url).json()

rows = []
for stage in jsonData['Stages']:
    events = stage['Events']
    for event in events:
        homeTeam = event['T1'][0]['Nm']
        homeScore = event['Tr1']

        awayTeam = event['T2'][0]['Nm']
        awayScore = event['Tr2']

        # 'Eps' holds the match clock, e.g. "63'" or "FT"
        matchClock = event['Eps']

        row = {
            'Home': homeTeam,
            'Home Score': homeScore,
            'Away': awayTeam,
            'Away Score': awayScore,
            'Match Clock': matchClock,
        }
        rows.append(row)

live_df = pd.DataFrame(rows)

Output:

print(live_df)
                  Home Home Score              Away Away Score Match Clock
0          RC Relizane          1         JS Saoura          2       90 1'
1            Urartu FC          0      FC Alashkert          0         63'
2         BATE Borisov          1   Torpedo Zhodino          1          FT
3      Ethnikos Achnas          0  Enosis Paralimni          0         24'
4        Hearts of Oak          0              WAFA          0         25'
5   PAE Veria NFC 2019          1        AE Larissa          1       90 1'
6             Arema FC          1    Persib Bandung          2          FT
7  Buducnost Podgorica          0              Zeta          0         23'
8               Zilina          1           Komarno          0         68'
9             NK Celje          1           Domzale          0         65'
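
The output above still contains rows marked FT, and the clock is a string rather than a number. A minimal sketch of one way to keep only matches that are in play and to turn the clock into a minute, working on the same row dicts before they go into `pd.DataFrame`. `parse_minute` and `keep_in_play` are hypothetical helpers. Reading "90 1'" as 90 plus 1 minute of stoppage time is an assumption, and any other non-numeric status the feed may send (a half-time label, for example) is dropped along with FT.

```python
import re

# Matches clock strings such as "63'", "90+1'" or "90 1'" (base minute plus
# optional stoppage time). Status labels such as "FT" do not match.
CLOCK_RE = re.compile(r"^\s*(\d+)(?:\s*\+?\s*(\d+))?\s*'?\s*$")


def parse_minute(match_clock):
    """Return the elapsed minute as an int, or None if the clock is not a minute."""
    m = CLOCK_RE.match(str(match_clock))
    if m is None:
        return None
    base, extra = m.groups()
    # Assumption: a second number is stoppage time added to the base minute.
    return int(base) + int(extra or 0)


def keep_in_play(rows):
    """Keep rows whose 'Match Clock' parses as a minute and add a 'Minute' key."""
    live_rows = []
    for row in rows:
        minute = parse_minute(row["Match Clock"])
        if minute is not None:
            live_rows.append({**row, "Minute": minute})
    return live_rows


# Three rows taken from the output above, reduced to the keys used here.
sample = [
    {"Home": "Urartu FC", "Away": "FC Alashkert", "Match Clock": "63'"},
    {"Home": "BATE Borisov", "Away": "Torpedo Zhodino", "Match Clock": "FT"},
    {"Home": "RC Relizane", "Away": "JS Saoura", "Match Clock": "90 1'"},
]
for row in keep_in_play(sample):
    print(row["Home"], row["Away"], row["Minute"])
```

With the answer's code, this would be used as `live_df = pd.DataFrame(keep_in_play(rows))`.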