I'm a newbie seeking help. I tried the following, without success:
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = "https://www.canada.ca/en/immigration-refugees-citizenship/corporate/mandate/policies-operational-instructions-agreements/ministerial-instructions/express-entry-rounds.html"
html_text = requests.get(url).text
soup = BeautifulSoup(html_text, 'html.parser')
data = []
# Verifying tables and their classes
print('Classes of each table:')
for table in soup.find_all('table'):
    print(table.get('class'))
Result:
['table']
None
Can anyone help me with how to get this data? Thank you so much.
CodePudding user response:
The data you see on the page is loaded from an external URL, so it is not in the HTML that requests downloads. To load the data, you can use the following example:
import requests
import pandas as pd
url = "https://www.canada.ca/content/dam/ircc/documents/json/ee_rounds_123_en.json"
data = requests.get(url).json()
df = pd.DataFrame(data["rounds"])
df = df.drop(columns=["drawNumberURL", "DrawText1", "mitext"])
# to_markdown() needs the optional "tabulate" package (pip install tabulate)
print(df.head(10).to_markdown(index=False))
Prints:
drawNumber | drawDate | drawDateFull | drawName | drawSize | drawCRS | drawText2 | drawDateTime | drawCutOff | drawDistributionAsOn | dd1 | dd2 | dd3 | dd4 | dd5 | dd6 | dd7 | dd8 | dd9 | dd10 | dd11 | dd12 | dd13 | dd14 | dd15 | dd16 | dd17 | dd18 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
231 | 2022-09-14 | September 14, 2022 | No Program Specified | 3,250 | 510 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | September 14, 2022 at 13:29:26 UTC | January 08, 2022 at 10:24:52 UTC | September 12, 2022 | 408 | 6,228 | 63,860 | 5,845 | 9,505 | 19,156 | 16,541 | 12,813 | 58,019 | 12,245 | 12,635 | 9,767 | 11,186 | 12,186 | 68,857 | 35,833 | 5,068 | 238,273 |
230 | 2022-08-31 | August 31, 2022 | No Program Specified | 2,750 | 516 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 31, 2022 at 13:55:23 UTC | April 16, 2022 at 18:24:41 UTC | August 29, 2022 | 466 | 7,224 | 63,270 | 5,554 | 9,242 | 19,033 | 16,476 | 12,965 | 58,141 | 12,287 | 12,758 | 9,796 | 11,105 | 12,195 | 68,974 | 36,001 | 5,120 | 239,196 |
229 | 2022-08-17 | August 17, 2022 | No Program Specified | 2,250 | 525 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 17, 2022 at 13:43:47 UTC | December 28, 2021 at 11:03:15 UTC | August 15, 2022 | 538 | 8,221 | 62,753 | 5,435 | 9,129 | 18,831 | 16,465 | 12,893 | 58,113 | 12,200 | 12,721 | 9,801 | 11,138 | 12,253 | 68,440 | 35,745 | 5,137 | 238,947 |
228 | 2022-08-03 | August 3, 2022 | No Program Specified | 2,000 | 533 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 03, 2022 at 15:16:24 UTC | January 06, 2022 at 14:29:50 UTC | August 2, 2022 | 640 | 8,975 | 62,330 | 5,343 | 9,044 | 18,747 | 16,413 | 12,783 | 57,987 | 12,101 | 12,705 | 9,747 | 11,117 | 12,317 | 68,325 | 35,522 | 5,145 | 238,924 |
227 | 2022-07-20 | July 20, 2022 | No Program Specified | 1,750 | 542 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | July 20, 2022 at 16:32:49 UTC | December 30, 2021 at 15:29:35 UTC | July 18, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
226 | 2022-07-06 | July 6, 2022 | No Program Specified | 1,500 | 557 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | July 6, 2022 at 14:34:34 UTC | November 13, 2021 at 02:20:46 UTC | July 11, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
225 | 2022-06-22 | June 22, 2022 | Provincial Nominee Program | 636 | 752 | Provincial Nominee Program | June 22, 2022 at 14:13:57 UTC | April 19, 2022 at 13:45:45 UTC | June 20, 2022 | 664 | 8,017 | 55,917 | 4,246 | 7,845 | 16,969 | 15,123 | 11,734 | 53,094 | 10,951 | 11,621 | 8,800 | 10,325 | 11,397 | 64,478 | 33,585 | 4,919 | 220,674 |
224 | 2022-06-08 | June 8, 2022 | Provincial Nominee Program | 932 | 796 | Provincial Nominee Program | June 08, 2022 at 14:03:28 UTC | October 18, 2021 at 17:13:17 UTC | June 6, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
223 | 2022-05-25 | May 25, 2022 | Provincial Nominee Program | 590 | 741 | Provincial Nominee Program | May 25, 2022 at 13:21:23 UTC | February 02, 2022 at 12:29:53 UTC | May 23, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
222 | 2022-05-11 | May 11, 2022 | Provincial Nominee Program | 545 | 753 | Provincial Nominee Program | May 11, 2022 at 14:08:07 UTC | December 15, 2021 at 20:32:57 UTC | May 9, 2022 | 635 | 7,193 | 52,684 | 3,749 | 7,237 | 16,027 | 14,466 | 11,205 | 50,811 | 10,484 | 11,030 | 8,393 | 9,945 | 10,959 | 62,341 | 32,590 | 4,839 | 211,093 |
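Note that the numbers in the output above contain thousands separators (for example 3,250), which suggests the JSON delivers them as strings. If you want to sort or do arithmetic on them, you will probably want to convert them first. Below is a minimal sketch of one way to do that. It assumes the values are strings shaped like the ones printed above; the helper name clean_rounds and the small sample frame are only for illustration, and in practice you would pass the df built from data["rounds"].

```python
import pandas as pd

# A few values copied from the output above (subset of columns), used here
# so the example runs without a network request.
sample = pd.DataFrame(
    {
        "drawNumber": ["231", "230", "225"],
        "drawDate": ["2022-09-14", "2022-08-31", "2022-06-22"],
        "drawSize": ["3,250", "2,750", "636"],
        "drawCRS": ["510", "516", "752"],
        "dd18": ["238,273", "239,196", "220,674"],
    }
)


def clean_rounds(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with count columns as numbers and drawDate as datetime.

    Hypothetical helper: assumes the column names shown in the output above.
    """
    out = df.copy()

    # Columns expected to hold numbers: the draw fields plus dd1, dd2, ...
    numeric_cols = [c for c in ("drawNumber", "drawSize", "drawCRS") if c in out.columns]
    numeric_cols += [c for c in out.columns if c.startswith("dd") and c[2:].isdigit()]

    for col in numeric_cols:
        # Strip thousands separators, then convert; unparsable values become NaN.
        stripped = out[col].astype(str).str.replace(",", "", regex=False)
        out[col] = pd.to_numeric(stripped, errors="coerce")

    if "drawDate" in out.columns:
        # drawDate is in ISO format (YYYY-MM-DD) in the output above.
        out["drawDate"] = pd.to_datetime(out["drawDate"], format="%Y-%m-%d", errors="coerce")

    return out


cleaned = clean_rounds(sample)
print(cleaned.dtypes)
```

With the real data you would call clean_rounds(df) right after dropping the unwanted columns. Using errors="coerce" is a deliberate choice here: unexpected values turn into NaN/NaT instead of raising, so it is worth checking for missing values afterwards.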