How to get HTML changes after pressing button with Beautiful Soup and Requests

Time:10-19

I want to get the HTML of this site, https://www.forebet.com/en/football-predictions, after pressing the More[+] button enough times to load all games. Each time the More[+] button at the bottom of the page is pressed, the HTML changes and shows more football games. How do I request the page with all the football games loaded?

from bs4 import BeautifulSoup
import requests

leagues = {"EPL", "UCL", "Es1", "De1", "Fr1", "Pt1", "It1", "UEL"}

class ForeBet:

    # Gets all games from the leagues in `leagues`, returning the games as a list of strings.
    # Game format is League|Date|Hour|Home Team|Away Team|Prob Home|Prob Tie|Prob Away
    def get_games_and_probs(self):

        response = requests.get('https://www.forebet.com/en/football-predictions')
        soup = BeautifulSoup(response.text, 'html.parser')
        results = list()

        # Game rows alternate between the two row classes
        games = soup.findAll(class_='rcnt tr_0') + soup.findAll(class_='rcnt tr_1')

        for game in games:
            if game.find(class_='shortTag').text.strip() in leagues:
                line = game.find(class_='shortTag').text + "|" + \
                    game.find(class_='date_bah').text.split(" ")[0] + "|" + \
                    game.find(class_='date_bah').text.split(" ")[1] + "|" + \
                    game.find(class_='homeTeam').text + "|" + \
                    game.find(class_='awayTeam').text + "|" + \
                    game.find(class_='fprc').findNext().text + "|" + \
                    game.find(class_='fprc').findNext().findNext().text + "|" + \
                    game.find(class_='fprc').findNext().findNext().findNext().text
                print(line)
                results.append(line)

        return results

Answer:

Requests and BeautifulSoup are used to fetch and parse data, not to interact with the site, so they cannot press the button for you. To do that you need a browser-automation tool such as Selenium.
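As a rough sketch of the Selenium route, the helper below keeps clicking a button until it is no longer visible and then returns the page source. The function names, the pause length, and the locator used for the "More" button are assumptions made for illustration; the real selector has to be read from the page's HTML.

```python
import time


def click_until_gone(driver, locator, pause=1.0, max_clicks=200):
    """Click the element matched by `locator` until it is no longer visible.

    `driver` is a Selenium WebDriver and `locator` is a (by, value) pair.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        # find_elements returns an empty list (it does not raise) when nothing matches
        buttons = [b for b in driver.find_elements(*locator) if b.is_displayed()]
        if not buttons:
            break
        buttons[0].click()
        clicks += 1
        time.sleep(pause)  # crude fixed wait for the new rows to load
    return clicks


def get_full_html(url, locator):
    # Imported here so click_until_gone can be tried out without a browser installed
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        click_until_gone(driver, locator)
        return driver.page_source
    finally:
        driver.quit()
```

A call might look like get_full_html('https://www.forebet.com/en/football-predictions', ('xpath', "//*[contains(text(), 'More')]")), where the XPath is only a guess at how the button could be located. The returned HTML can then be passed to BeautifulSoup in place of response.text.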

Your other option is to see whether you can fetch the data directly, by finding the request the page sends when you click the button and checking whether its parameters can be changed to ask for the next batch of games. Does this do the trick for you?

import pandas as pd
import requests

results = pd.DataFrame()
i = 0
while True:
    print(i)
    url = 'https://m.forebet.com/scripts/getrs.php'
    payload = {
        'ln': 'en',
        'tp': '1x2',
        'in': '%s' % (i + 11),
        'ord': '0'}

    jsonData = requests.get(url, params=payload).json()
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    results = pd.concat([results, pd.DataFrame(jsonData[0])], sort=False).reset_index(drop=True)

    # Keep requesting until an id shows up twice, i.e. the endpoint starts repeating rows
    if max(results['id'].value_counts()) <= 1:
        i += 1
    else:
        results = results.drop_duplicates()
        break

Output:

print(results)
          id  pr_under  ...    country         full_name
0    1473708        31  ...    England   Isthmian League
1    1473713        35  ...    England   Isthmian League
2    1473745        28  ...    England   Isthmian League
3    1473710        35  ...    England   Isthmian League
4    1473033        28  ...    England  Premier League 2
..       ...       ...  ...        ...               ...
515  1419208        47  ...  Argentina  Torneo Federal A
516  1419156        57  ...  Argentina  Torneo Federal A
517  1450589        50  ...    Armenia    Premier League
518  1450590        35  ...    Armenia    Premier League
519  1450591        52  ...    Armenia    Premier League

[518 rows x 73 columns]
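
The stopping rule in the loop above can also be written without pandas. The sketch below is a variant, not the same rule: it stops as soon as a page contributes no new ids, and it takes the page-fetching step as a parameter so the logic can be checked without hitting the site. The names collect_until_repeat and fetch_page are made up for this example, and it assumes each row is a dict with an 'id' key.

```python
def collect_until_repeat(fetch_page, max_pages=1000):
    """Call fetch_page(0), fetch_page(1), ... until a page brings no new ids.

    fetch_page is any callable that takes a page counter and returns
    a list of dicts, each with an 'id' key.
    """
    seen = {}  # id -> row; dicts keep insertion order
    for page in range(max_pages):
        rows = fetch_page(page)
        new_rows = [r for r in rows if r["id"] not in seen]
        if not new_rows:
            # Nothing new (or an empty page): assume everything is loaded
            break
        for r in new_rows:
            seen[r["id"]] = r
    return list(seen.values())
```

With the endpoint from the answer, fetch_page could be something like lambda i: requests.get(url, params={'ln': 'en', 'tp': '1x2', 'in': str(i + 11), 'ord': '0'}).json()[0], assuming the first element of the JSON response is a list of row dicts as the DataFrame construction above suggests.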