from bs4 import BeautifulSoup
import requests
url = 'https://www.iplt20.com/stats/2021/most-runs'
source = requests.get(url)
soup = BeautifulSoup(source.text, 'html.parser')
soup.find_all('table', class_='np-mostruns_table')
CodePudding user response:
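That is probably because the page content is loaded via JavaScript. I have seen people use MechanicalSoup instead, although note that MechanicalSoup does not execute JavaScript either, so it only helps when the data is already in the HTML. https://mechanicalsoup.readthedocs.io/en/stable/tutorial.html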
CodePudding user response:
The table on this page is rendered by JavaScript, and requests does not execute JavaScript.
You have to use an automated browser such as Selenium, or something similar.
I also suggest using a browser extension that toggles JavaScript on and off while you are scraping, so you can see what the page contains without it.
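A quick way to confirm this without a browser is to check whether the table exists in the HTML the server actually returns. The sketch below uses only the standard library's html.parser; the names ClassFinder and has_element are hypothetical helpers, and both HTML strings are made-up stand-ins for the raw response and for the page after JavaScript has run.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether a given tag with a given CSS class appears in the HTML."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may hold several space-separated names (or be missing)
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.found = True


def has_element(html, tag, class_name):
    finder = ClassFinder(tag, class_name)
    finder.feed(html)
    return finder.found


# Made-up examples: what requests might receive vs. what a browser renders
static_html = '<div id="app"></div><script src="app.js"></script>'
rendered_html = '<table class="np-mostruns_table stats"><tr><td>1</td></tr></table>'

print(has_element(static_html, "table", "np-mostruns_table"))
print(has_element(rendered_html, "table", "np-mostruns_table"))
```

If the check fails on source.text from the question, the table is not in the static HTML, and no change to the BeautifulSoup selector will find it.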
CodePudding user response:
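If you are looking for a table by class, you can also use:
soup.find("table", {"class": "np-mostruns_table"})
This is equivalent to the class_ keyword form, so it only returns the table if it is present in the HTML that requests receives.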
CodePudding user response:
You can't get the table because it is loaded dynamically. You need to find the request that loads the data and build your table from its response. The response has many more fields than the site shows, so you can add any others you need. The example below includes only the fields that appear on the site.
import json

import pandas as pd
import requests

# JSONP feed that the stats page requests in the background
url = 'https://ipl-stats-sports-mechanic.s3.ap-south-1.amazonaws.com/ipl/feeds/stats/60-toprunsscorers.js?callback=ontoprunsscorers'
results = []
response = requests.get(url)
# The body looks like ontoprunsscorers({...}); keep only the JSON between the parentheses
json_data = json.loads(response.text[response.text.find('(') + 1:response.text.find(')')])
# Map the feed's field names to the column headers used on the site
for player in json_data['toprunsscorers']:
    data = {
        'Player': player['StrikerName'],
        'Mat': player['Matches'],
        'Inns': player['Innings'],
        'NO': player['NotOuts'],
        'Runs': player['TotalRuns'],
        'HS': player['HighestScore'],
        'AVG': player['BattingAverage'],
        'BF': player['Balls'],
        'SR': player['StrikeRate'],
        '100': player['Centuries'],
        '50': player['FiftyPlusRuns'],
        '4s': player['Fours'],
        '6s': player['Sixes']
    }
    results.append(data)
df = pd.DataFrame(results)
print(df)
OUTPUT:
                  Player  Mat  Inns  NO  Runs    HS  ...   BF      SR  100  50  4s  6s
0            Jos Buttler   17    17   2   863   116  ...  579  149.05    4   4  83  45
1              K L Rahul   15    15   3   616  103*  ...  455  135.38    2   4  45  30
2        Quinton De Kock   15    15   1   508  140*  ...  341  148.97    1   3  47  23
3          Hardik Pandya   15    15   4   487   87*  ...  371  131.26    0   4  49  12
4           Shubman Gill   16    16   2   483    96  ...  365  132.32    0   4  51  11
..                   ...   ..   ...  ..   ...   ...  ...  ...     ...   ..  ..  ..  ..
157     Fazalhaq Farooqi    3     1   1     2    2*  ...    8   25.00    0   0   0   0
158   Jagadeesha Suchith    5     2   0     2     2  ...    8   25.00    0   0   0   0
159          Tim Southee    9     5   1     2    1*  ...   12   16.66    0   0   0   0
160  Nathan Coulter-Nile    1     1   1     1    1*  ...    2   50.00    0   0   0   0
161        Anrich Nortje    6     1   1     1    1*  ...    6   16.66    0   0   0   0
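One caveat about the slicing in the code above: it stops at the first ')' in the response, which would cut the JSON short if any value happened to contain a closing parenthesis. A slightly more defensive sketch is shown here. The helper name unwrap_jsonp is hypothetical, and the sample string is made up for illustration (only the toprunsscorers and StrikerName keys come from the feed used above).

```python
import json


def unwrap_jsonp(text):
    """Parse the JSON inside a JSONP response such as callback({...});"""
    start = text.find('(')   # first '(' opens the callback call
    end = text.rfind(')')    # last ')' closes it, even if values contain ')'
    if start == -1 or end == -1 or end < start:
        raise ValueError('not a JSONP response')
    return json.loads(text[start + 1:end])


# Made-up payload whose player name contains parentheses
sample = 'ontoprunsscorers({"toprunsscorers": [{"StrikerName": "A. Batter (c)"}]});'
data = unwrap_jsonp(sample)
print(data['toprunsscorers'][0]['StrikerName'])
```

With the real feed you would pass response.text to the helper and loop over the result exactly as before.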