Scraping a website with data that's dynamically generated

Time:10-30

I am trying to scrape the match table from this link: image for selecting

I looked at the browser's network tab, which eventually led me to datatables.net. However, I can't figure out how to get the data from that website. It seems to make a POST request with certain headers, but it's not clear to me what that request does.

There is no API call.

CodePudding user response:

The desired table data isn't populated by JavaScript, meaning the data is already in the static HTML, and you can load the table into a pandas DataFrame with read_html.

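from io import StringIO
import pandas as pd
import requests

headers = {'user-agent': 'Mozilla/5.0'}
url = 'https://www.kayak-polo.info/kpmatchs.php?lang=en&event=0&Saison=2022&Group=CM&Compet=*&J=*&Round=*&Css=&navGroup=1'
# Fetch the page; the match table is part of the returned HTML
html = requests.get(url, headers=headers).text
# read_html returns a list of DataFrames, one per <table>; take the first.
# StringIO avoids the deprecation warning newer pandas raises for raw HTML strings.
df = pd.read_html(StringIO(html))[0]
print(df)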

Output:

          #             Date  ...                     Referee 2                                              Games
    0    501  2022-08-1610:20  ...             THOMAS Mark (GBR)  08-16 10:20 - Pitch 1  Group UW  ITA U21 Women...     
    1    503  2022-08-1610:20  ...           BELISLE Ricky (AUS)  08-16 10:20 - Pitch 3  Group UW  ESP U21 Women...     
    2    504  2022-08-1610:20  ...  ANDZIAK-GINTER Marzena (POL)  08-16 10:20 - Pitch 4  Group UW  NED U21 Women...     
    3    508  2022-08-1613:00  ...           BELISLE Ricky (AUS)  08-16 13:00 - Pitch 4  Group UW  ITA U21 Women...     
    4    506  2022-08-1813:15  ...           BELISLE Ricky (AUS)  08-18 13:15 - Pitch 2  Group UW  POL U21 Women...     
    ..   ...              ...  ...                           ...                                                ...     
    280  464  2022-08-1914:25  ...            WOLFF Sandra (GER)  08-19 14:25 - Pitch 4  Classifying 13-16  CZE ...     
    281  545  2022-08-2016:00  ...                           NaN    08-20 16:00 - Pitch 5th place  GBR U21 Women  -     
    282  195  2022-08-2112:05  ...                           NaN  08-21 12:05 - Pitch 5  21th place  UKR Men  Aw...     
    283  546  2022-08-2016:00  ...                           NaN    08-20 16:00 - Pitch 8th place  ITA U21 Women  -     
    284    #             Date  ...                     Referee 2                                                NaN     
    

[285 rows x 11 columns]
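Note that the last row of the output (index 284) repeats the column names, and the Date column has the date and time run together (e.g. 2022-08-1610:20). A minimal cleanup sketch follows, assuming the columns are named '#' and 'Date' as in the printed output; the helper name clean_matches and the small sample frame are made up for illustration and are not scraped data.

```python
import pandas as pd

def clean_matches(df: pd.DataFrame) -> pd.DataFrame:
    """Drop repeated header rows and parse the fused date/time column."""
    # A repeated header row carries the column name '#' in the '#' column
    is_header = df['#'].astype(str) == '#'
    out = df.loc[~is_header].copy()
    # Values look like '2022-08-1610:20': date and time with no separator
    out['Date'] = pd.to_datetime(out['Date'], format='%Y-%m-%d%H:%M')
    # With the header row gone, the match numbers can be made numeric
    out['#'] = pd.to_numeric(out['#'])
    return out.reset_index(drop=True)

# Hand-made frame mimicking the shape of the scraped output (assumption)
sample = pd.DataFrame({
    '#': ['501', '503', '#'],
    'Date': ['2022-08-1610:20', '2022-08-1610:20', 'Date'],
})
cleaned = clean_matches(sample)
print(cleaned)
```

On the real data you would call clean_matches(df) on the DataFrame returned by read_html; if the page layout differs from what the output above suggests, the column names and date format would need adjusting.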