Unable to scrape data from a local HTML file using BeautifulSoup

Time:01-20

I tried using BeautifulSoup to scrape rows of table data from a local HTML file (download link provided below), without success.

Here's my effort:

from bs4 import BeautifulSoup
import json


with open("web_summary.html", "r") as file:
    html_file = file.read()

soup = BeautifulSoup(html_file, "html.parser")

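# Locate the <script> tag inside the summary component.
script = soup.find("div", {"data-component": "CellRangerSummary", "data-key": "summary"}).find("script")
# Split only on the first "=" so any "=" inside the JSON is preserved.
# json.loads() no longer accepts an `encoding` argument (removed in Python 3.9).
table_data = json.loads(script.text.split("=", 1)[1])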
summary_data = table_data['summary']
summary_tab = summary_data['summary_tab']

rows = summary_tab['table']['rows']

for row in rows:
    print(row[0],row[1])
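
One fragile step above is cutting the JSON out of the script text: splitting on "=" can break if the JSON itself contains "=", and anything after the object (such as a trailing semicolon) makes json.loads() fail. Below is a minimal sketch of a more tolerant approach using only the standard library. The helper name extract_json_assignment and the sample string are made up for illustration, and it assumes the script body looks like `name = {...};`, which may not match the real file.

```python
import json


def extract_json_assignment(script_text):
    """Return the first JSON object found in a `name = {...}` script body."""
    # Hypothetical helper: assumes the first "{" starts the JSON object.
    start = script_text.index("{")
    # raw_decode parses one JSON value starting at `start` and ignores
    # whatever follows it (a trailing semicolon, more JavaScript, etc.).
    value, _end = json.JSONDecoder().raw_decode(script_text, start)
    return value


# Made-up sample that mimics the nesting used in the question.
sample = (
    'const data = {"summary": {"summary_tab": {"table": '
    '{"rows": [["Cells", "1,234"], ["URL", "a=b"]]}}}};'
)
rows = extract_json_assignment(sample)["summary"]["summary_tab"]["table"]["rows"]
for name, value in rows:
    print(name, value)
```

With the real file, the text passed in would be the script.text value from the code above; if the first "{" in that script is not the start of the data object, the starting index would need to be found differently.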
