Python: Getting a table in CSV from a website without a table class

Time:09-18

I'm new to this and looking for help. I tried the following without success.

import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "https://www.canada.ca/en/immigration-refugees-citizenship/corporate/mandate/policies-operational-instructions-agreements/ministerial-instructions/express-entry-rounds.html"
html_text = requests.get(url).text
soup = BeautifulSoup(html_text, 'html.parser')
data = []

# Verifying tables and their classes
print('Classes of each table:')
for table in soup.find_all('table'):
    print(table.get('class'))

Result: ['table'] None

Can anyone help me with how to get this data? Thank you so much.
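
One way to check whether the rows are present in the downloaded HTML at all is to count the `<tr>` elements inside each table. The sketch below runs on a hypothetical stand-in snippet (`sample_html`), not on the live page, and `count_body_rows` is an illustrative helper name. If the counts come back as zero, the rows are probably filled in later by a script.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page: two tables whose
# bodies are empty, as they would be if a script filled them in later.
sample_html = """
<table class="table">
  <thead><tr><th>#</th><th>Date</th></tr></thead>
  <tbody></tbody>
</table>
<table><tbody></tbody></table>
"""


def count_body_rows(html_text):
    """Return (class attribute, number of body rows) for every table."""
    soup = BeautifulSoup(html_text, "html.parser")
    counts = []
    for table in soup.find_all("table"):
        body = table.find("tbody")
        # Fall back to all rows when the table has no <tbody> element.
        rows = body.find_all("tr") if body is not None else table.find_all("tr")
        counts.append((table.get("class"), len(rows)))
    return counts


row_counts = count_body_rows(sample_html)
print(row_counts)
```

To try it on the real page, pass `html_text` from the code above instead of `sample_html`.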

CodePudding user response:

The data you see on the page is loaded from an external URL. To load it, you can use the following example:

import requests
import pandas as pd


url = "https://www.canada.ca/content/dam/ircc/documents/json/ee_rounds_123_en.json"

data = requests.get(url).json()
df = pd.DataFrame(data["rounds"])
df = df.drop(columns=["drawNumberURL", "DrawText1", "mitext"])

print(df.head(10).to_markdown(index=False))

Prints:

| drawNumber | drawDate | drawDateFull | drawName | drawSize | drawCRS | drawText2 | drawDateTime | drawCutOff | drawDistributionAsOn | dd1 | dd2 | dd3 | dd4 | dd5 | dd6 | dd7 | dd8 | dd9 | dd10 | dd11 | dd12 | dd13 | dd14 | dd15 | dd16 | dd17 | dd18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 231 | 2022-09-14 | September 14, 2022 | No Program Specified | 3,250 | 510 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | September 14, 2022 at 13:29:26 UTC | January 08, 2022 at 10:24:52 UTC | September 12, 2022 | 408 | 6,228 | 63,860 | 5,845 | 9,505 | 19,156 | 16,541 | 12,813 | 58,019 | 12,245 | 12,635 | 9,767 | 11,186 | 12,186 | 68,857 | 35,833 | 5,068 | 238,273 |
| 230 | 2022-08-31 | August 31, 2022 | No Program Specified | 2,750 | 516 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 31, 2022 at 13:55:23 UTC | April 16, 2022 at 18:24:41 UTC | August 29, 2022 | 466 | 7,224 | 63,270 | 5,554 | 9,242 | 19,033 | 16,476 | 12,965 | 58,141 | 12,287 | 12,758 | 9,796 | 11,105 | 12,195 | 68,974 | 36,001 | 5,120 | 239,196 |
| 229 | 2022-08-17 | August 17, 2022 | No Program Specified | 2,250 | 525 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 17, 2022 at 13:43:47 UTC | December 28, 2021 at 11:03:15 UTC | August 15, 2022 | 538 | 8,221 | 62,753 | 5,435 | 9,129 | 18,831 | 16,465 | 12,893 | 58,113 | 12,200 | 12,721 | 9,801 | 11,138 | 12,253 | 68,440 | 35,745 | 5,137 | 238,947 |
| 228 | 2022-08-03 | August 3, 2022 | No Program Specified | 2,000 | 533 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 03, 2022 at 15:16:24 UTC | January 06, 2022 at 14:29:50 UTC | August 2, 2022 | 640 | 8,975 | 62,330 | 5,343 | 9,044 | 18,747 | 16,413 | 12,783 | 57,987 | 12,101 | 12,705 | 9,747 | 11,117 | 12,317 | 68,325 | 35,522 | 5,145 | 238,924 |
| 227 | 2022-07-20 | July 20, 2022 | No Program Specified | 1,750 | 542 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | July 20, 2022 at 16:32:49 UTC | December 30, 2021 at 15:29:35 UTC | July 18, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 226 | 2022-07-06 | July 6, 2022 | No Program Specified | 1,500 | 557 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | July 6, 2022 at 14:34:34 UTC | November 13, 2021 at 02:20:46 UTC | July 11, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 225 | 2022-06-22 | June 22, 2022 | Provincial Nominee Program | 636 | 752 | Provincial Nominee Program | June 22, 2022 at 14:13:57 UTC | April 19, 2022 at 13:45:45 UTC | June 20, 2022 | 664 | 8,017 | 55,917 | 4,246 | 7,845 | 16,969 | 15,123 | 11,734 | 53,094 | 10,951 | 11,621 | 8,800 | 10,325 | 11,397 | 64,478 | 33,585 | 4,919 | 220,674 |
| 224 | 2022-06-08 | June 8, 2022 | Provincial Nominee Program | 932 | 796 | Provincial Nominee Program | June 08, 2022 at 14:03:28 UTC | October 18, 2021 at 17:13:17 UTC | June 6, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 223 | 2022-05-25 | May 25, 2022 | Provincial Nominee Program | 590 | 741 | Provincial Nominee Program | May 25, 2022 at 13:21:23 UTC | February 02, 2022 at 12:29:53 UTC | May 23, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 222 | 2022-05-11 | May 11, 2022 | Provincial Nominee Program | 545 | 753 | Provincial Nominee Program | May 11, 2022 at 14:08:07 UTC | December 15, 2021 at 20:32:57 UTC | May 9, 2022 | 635 | 7,193 | 52,684 | 3,749 | 7,237 | 16,027 | 14,466 | 11,205 | 50,811 | 10,484 | 11,030 | 8,393 | 9,945 | 10,959 | 62,341 | 32,590 | 4,839 | 211,093 |
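
Since the question asks for CSV, the same DataFrame can be written out with `DataFrame.to_csv` instead of printed. This is a minimal sketch: the helper name `rounds_to_csv`, the output file name `ee_rounds.csv`, and the one-row `sample` payload are assumptions for illustration. In real use the payload would be `requests.get(url).json()` from the code above.

```python
import pandas as pd

# Columns the answer drops; errors="ignore" keeps the helper working
# if the feed stops sending one of them.
DROP_COLUMNS = ["drawNumberURL", "DrawText1", "mitext"]


def rounds_to_csv(payload, csv_path):
    """Write the "rounds" records of the decoded JSON payload to a CSV file."""
    df = pd.DataFrame(payload["rounds"])
    df = df.drop(columns=DROP_COLUMNS, errors="ignore")
    # Values such as "3,250" contain commas; to_csv quotes them automatically.
    df.to_csv(csv_path, index=False)
    return df


# Hypothetical one-row sample standing in for requests.get(url).json()
sample = {"rounds": [{"drawNumber": "231", "drawSize": "3,250", "mitext": ""}]}
df = rounds_to_csv(sample, "ee_rounds.csv")
```

Passing `index=False` leaves the DataFrame's row index out of the file, which matches the `index=False` used for the markdown output.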