Scrape HTML with Beautiful Soup

Time:08-24

I'm having trouble scraping content from the following: [image not included]

I realize there are many similar questions here and elsewhere on the web, but I'm still stuck after referencing them. Thank you in advance for the help.

CodePudding user response:

The data is generated dynamically from an external source via an API. bs4 can't render JavaScript, which is why you're getting only the static portion of the HTML. You can request the API endpoint directly instead.
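To make the point concrete, here is a minimal sketch of what Beautiful Soup sees when a page fills its content in with JavaScript. The HTML string, the `history` id, and the `/api/history` path are all made up for illustration and are not taken from the real page; the sketch only assumes that the server sends an empty container which a script populates later in the browser.

```python
from bs4 import BeautifulSoup

# Hypothetical page source: the table container is empty in the HTML the
# server sends, and a script would fill it in later inside a browser.
STATIC_HTML = """
<html>
  <body>
    <h1>Consensus history</h1>
    <div id="history"></div>
    <script>
      fetch("/api/history").then(r => r.json()).then(renderTable);
    </script>
  </body>
</html>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")
container = soup.find("div", id="history")

# The element exists, but it has no children because the script never ran.
heading_text = soup.h1.get_text()
container_text = container.get_text(strip=True)
row_count = len(container.find_all("tr"))

print(heading_text)          # static text is available
print(repr(container_text))  # nothing inside the container
print(row_count)             # no table rows to find
```

The static heading is readable, while the container that a browser would show as a table is empty. That is the symptom described in the question, and it is why going to the data source directly is usually the simpler route.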

Example:

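import pandas as pd
import requests

# JSON endpoint that supplies the page's data
api_url = 'https://pregame.com/api/gamecenter/consensushistory?e=171763&s=40&r=1000&a=1&c=1&t=693'
r = requests.get(api_url, timeout=10)
r.raise_for_status()  # stop here on an HTTP error instead of parsing an error page

# The records are stored under the 'Items' key of the JSON response
df = pd.DataFrame(r.json()['Items'])
print(df)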

Output:

         Id                  DateTime   Odds  ...  IsPickActionChanged  PickAction  PickPercentage
0    60149470   2021-10-17T12:39:05.18Z     10  ...                False         172              86
1    60147744  2021-10-17T12:16:32.793Z     10  ...                False         169              86
2    60146757   2021-10-17T12:00:41.64Z     10  ...                False         162              86
3    60146458  2021-10-17T11:55:49.823Z     10  ...                False         162              86
4    60146333  2021-10-17T11:53:50.477Z     10  ...                False         162              86
..        ...                       ...    ...  ...                  ...         ...             ...
130  59716689  2021-10-12T17:41:27.397Z     10  ...                False          14              82
131  59716636   2021-10-12T17:40:44.01Z     10  ...                False          14              82
132  59716531  2021-10-12T17:39:28.603Z     10  ...                False          14              82
133  59715523  2021-10-12T17:24:22.067Z     10  ...                False          13              81
134  59655757  2021-10-11T01:02:33.873Z  Other  ...                 True           1             100

[135 rows x 12 columns]
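If you need to work with individual records rather than print the whole frame, something like the sketch below may help. It runs offline against a trimmed sample whose values are copied from the first two rows of the output above. The JSON value types (integers for `Id` and `PickPercentage`) are an assumption based on how the frame prints, and `parse_timestamp` and `extract_rows` are hypothetical helper names, not part of any library.

```python
import json
from datetime import datetime, timezone

# Trimmed sample shaped like the 'Items' payload. Values come from the first
# two rows of the printed output; the remaining fields are omitted, and the
# JSON types are assumed.
SAMPLE = """
{"Items": [
  {"Id": 60149470, "DateTime": "2021-10-17T12:39:05.18Z", "PickAction": 172, "PickPercentage": 86},
  {"Id": 60147744, "DateTime": "2021-10-17T12:16:32.793Z", "PickAction": 169, "PickPercentage": 86}
]}
"""


def parse_timestamp(value):
    """Parse a UTC timestamp such as '2021-10-17T12:39:05.18Z'."""
    # %f accepts one to six fractional digits, so '.18' and '.793' both parse.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def extract_rows(payload):
    """Return (id, timestamp, pick_percentage) tuples from a decoded payload."""
    # .get() with a default avoids a KeyError if 'Items' is missing.
    items = payload.get("Items", [])
    return [
        (item["Id"], parse_timestamp(item["DateTime"]), item["PickPercentage"])
        for item in items
    ]


rows = extract_rows(json.loads(SAMPLE))
for row_id, when, pct in rows:
    print(row_id, when.isoformat(), pct)
```

With the live response you would pass `r.json()` to `extract_rows` instead of the sample. If you stay in pandas, `pd.to_datetime(df['DateTime'])` should give you a comparable timestamp column.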