Scraping Beautiful Soup - find a value from a table

Time:08-19

After scraping the team names and the date (which are not in a table), I am trying to extract odds from a table.

Screenshot: https://prnt.sc/Vcz_GNAz77ni
URL: https://www.oddsmath.com/football/england/premier-league-1281/2022-08-28/wolverhampton-wanderers-vs-newcastle-united-3882295/

from bs4 import BeautifulSoup
import requests

url = 'https://www.oddsmath.com/football/england/premier-league-1281/2022-08-28/wolverhampton-wanderers-vs-newcastle-united-3882295/'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')

match = soup.find('li', {'class':'active'}).text
print(match)

date = soup.find('time', {'class':'event-time'})['datetime']
print(date)

All_1X2_FT = soup.find_all('table', id = 'table-odds-cat-0')
print(All_1X2_FT)

Output:

Wolverhampton Wanderers vs Newcastle United
2022-08-28T15:00:00+02:00
[<table  id="table-odds-cat-0">
<thead></thead>
<tbody></tbody>
</table>]

From here, though, I cannot find a way to reach a particular bookmaker (identified by 'data-x-id') or a specific outcome (class = 'odds odds -1' in this example).

I would appreciate any help. Thank you.

CodePudding user response:

That data is loaded into the page from an API after the initial HTML arrives, which is why the table comes back empty in the response that requests receives. You need to query that API endpoint to get the table's data. This is one way of obtaining it:

import requests
import pandas as pd

# JSON endpoint the page calls to fill the odds table (event_id matches the ID in the match URL)
url = 'https://www.oddsmath.com/api/v1/live-odds.json/?event_id=3882295&cat_id=0&include_exchanges=1&language=en&country_code=JP'
r = requests.get(url)
json_obj = r.json()['data']  # dict keyed by betting platform name
the_list = []
for k in json_obj:
    try:
        live = json_obj[k]['live']
        the_list.append((k, json_obj[k]['x-id'], live['updated'], live['1'], live['X'], live['2']))
    except Exception:
        # entries that lack one of the fields above are reported and skipped
        print(k, 'error')
# set() drops duplicate rows; it also means the row order is arbitrary
df = pd.DataFrame(list(set(the_list)), columns=['Betting Platform', 'X-ID', 'Updated at', '1', 'X', '2'])
print(df)

This would return:

Betfair error
    Betting Platform  X-ID  Updated at           1      X      2
0   Campobet          64    2022-08-17 16:39:17  2.670  3.200  2.670
1   Bettogoal         83    2022-08-18 14:01:44  2.740  3.450  2.630
2   Suprabets         80    2022-08-18 14:07:53  2.800  3.530  2.690
3   10Bet             12    2022-08-18 12:45:17  2.750  3.300  2.650
4   Betway            16    2022-08-18 09:47:14  2.700  3.200  2.700
5   Winner            81    2022-08-17 16:02:29  2.750  3.750  2.500
6   Dafabet           13    2022-08-18 17:38:08  2.520  3.150  2.740
7   Titanbet          11    2022-08-17 16:03:53  2.650  3.200  2.700
8   Rabona            65    2022-08-17 16:39:17  2.670  3.200  2.670
9   Megapari          74    2022-08-16 19:27:11  2.835  3.500  2.642
10  Librabet          66    2022-08-17 16:39:24  2.670  3.200  2.670
11  BetClic           5     2022-08-17 16:03:15  2.700  3.300  2.730
12  Marathonbet       38    2022-08-16 19:29:25  2.780  3.450  2.590
13  bet-at-home       4     2022-08-17 16:02:35  2.650  3.200  2.650
14  DoubleBet         40    2022-08-16 19:27:12  2.835  3.500  2.642
15  FEZbet            84    2022-08-17 17:49:45  2.670  3.200  2.670
16  Tipico            56    2022-08-15 11:44:40  2.700  3.200  2.650
17  BetWinner         41    2022-08-16 19:33:07  2.835  3.500  2.642
18  Unibet            2     2022-08-15 06:50:45  2.800  3.400  2.500
19  1XBET             32    2022-08-16 19:27:12  2.835  3.500  2.642
20  888sport          35    2022-08-15 06:55:40  2.800  3.400  2.500
21  SBOBET            8     2022-08-18 10:52:16  2.651  3.016  2.555
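
To reuse this for other matches, the endpoint URL can probably be derived from the match page URL, since the event_id in the query string (3882295) is the same number that ends the match URL. The helper below is only a sketch under that assumption; build_odds_url is a made-up name, the other query parameters are copied unchanged from the URL in this answer, and treating cat_id as the number in the table id 'table-odds-cat-0' is a guess.

```python
import re
from urllib.parse import urlencode

# Endpoint taken from the answer; everything else here is illustrative.
API_BASE = 'https://www.oddsmath.com/api/v1/live-odds.json/'


def build_odds_url(match_url, cat_id=0):
    """Build the live-odds URL for a match page URL.

    Assumes the event ID is the run of digits at the very end of the path.
    """
    m = re.search(r'-(\d+)/?$', match_url)
    if m is None:
        raise ValueError('no trailing event ID in ' + match_url)
    params = {
        'event_id': m.group(1),
        'cat_id': cat_id,          # assumed to match 'table-odds-cat-<n>'
        'include_exchanges': 1,    # copied from the answer's URL
        'language': 'en',
        'country_code': 'JP',
    }
    return API_BASE + '?' + urlencode(params)


match_url = ('https://www.oddsmath.com/football/england/premier-league-1281/'
             '2022-08-28/wolverhampton-wanderers-vs-newcastle-united-3882295/')
print(build_odds_url(match_url))
```

For the match in the question this reproduces the URL used above, which can then be passed to requests.get as before.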

You can dig further into that JSON response to pull out other fields as well.
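
For instance, the question asked for one bookmaker (by its x-id) and one outcome. A minimal sketch of that lookup follows; odds_for is a hypothetical helper, and the sample dict is hand-made in the shape the loop above reads, with values copied from the printed table. The real response may use different value types (the x-id could be a string or a number, so both sides are compared as strings), and the empty Betfair entry only stands in for whatever made that key fail.

```python
def odds_for(data, x_id, outcome):
    """Return (bookmaker, odds) for the bookmaker with this x-id, or None.

    `data` is the dict under the 'data' key of the JSON response;
    `outcome` is one of '1', 'X' or '2'.
    """
    for name, entry in data.items():
        if not isinstance(entry, dict):
            continue
        if str(entry.get('x-id')) != str(x_id):
            continue
        live = entry.get('live')
        if isinstance(live, dict) and outcome in live:
            return name, live[outcome]
    return None


# Hand-made sample, not a real API response.
sample = {
    'Unibet': {'x-id': 2, 'live': {'updated': '2022-08-15 06:50:45',
                                   '1': '2.800', 'X': '3.400', '2': '2.500'}},
    'SBOBET': {'x-id': 8, 'live': {'updated': '2022-08-18 10:52:16',
                                   '1': '2.651', 'X': '3.016', '2': '2.555'}},
    'Betfair': {},  # hypothetical entry with none of the expected fields
}

print(odds_for(sample, 2, '1'))
print(odds_for(sample, '8', 'X'))
```

With the live data, json_obj from the code above would be passed in place of sample.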
