Web scraping using BeautifulSoup - link embedded behind the marked up text

Time:10-03

I am trying to scrape data from a table of sectors.

Below is the code I have so far. I am stuck on getting the content of each sector: my code returns nothing at all. Please help! Thank you in advance.

    import requests
    from bs4 import BeautifulSoup
    url = "https://eresearch.fidelity.com/eresearch/goto/markets_sectors/landing.jhtml"
    req = requests.get(url)
    soup = BeautifulSoup(req.content, "html.parser")
    
    links_list = list()
    next_page_link = soup.find_all("a", class_="heading1")
    for link in next_page_link:
        next_page = "https://eresearch.fidelity.com" + link.get("href")
        links_list.append(next_page)
    
    for item in links_list:
        soup2 = BeautifulSoup(requests.get(item).content,'html.parser')
        print(soup2)

Answer:

Try:

    import requests
    from bs4 import BeautifulSoup

    url = "https://eresearch.fidelity.com/eresearch/goto/markets_sectors/landing.jhtml"
    # Template for each sector's detail page; sector_id comes from the landing-page links.
    sector_url = "https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector={sector_id}"

    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    # Header row for the printed table.
    print(
        "{:<30} {:<8} {:<8} {:<8} {}".format(
            "Sector name",
            "Moving",
            "MktCap",
            "MktWght",
            "Link",
        )
    )
    for a in soup.select("a.heading1"):
        # The sector id is whatever follows the last "=" in the link's href.
        sector_id = a["href"].split("=")[-1]

        u = sector_url.format(sector_id=sector_id)
        s = BeautifulSoup(requests.get(u).content, "html.parser")

        # Spans that are the first <span> among their siblings, inside a <td>
        # that contains an element with class "timestamp".
        data = s.select("td:has(.timestamp) span:nth-of-type(1)")

        # Only the first three values are printed.
        print(
            "{:<30} {:<8} {:<8} {:<8} {}".format(
                s.h1.text, *[d.text for d in data][:3], u
            )
        )

Prints:

Sector name                    Moving   MktCap   MktWght  Link
Communication Services          1.78%   $6.70T   11.31%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=50
Consumer Discretionary          0.62%   $8.82T   12.32%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=25
Consumer Staples                0.26%   $4.41T   5.75%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=30
Energy                          3.30%   $2.83T   2.60%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=10
Financials                      1.59%   $8.79T   11.22%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=40
Health Care                     0.07%   $8.08T   13.29%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=35
Industrials                     1.41%   $5.72T   8.02%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=20
Information Technology          1.44%   $15.52T  28.04%   https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=45
Materials                       1.60%   $2.51T   2.46%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=15
Real Estate                     1.04%   $1.67T   2.58%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=60
Utilities                      -0.04%   $1.56T   2.42%    https://eresearch.fidelity.com/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=55
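
A side note on the `a["href"].split("=")[-1]` step: it works only as long as `sector` is the last query parameter in the link. If that ever changes, parsing the query string is a bit more robust. Below is a minimal sketch using only the standard library. The helper names `sector_id_from_href` and `absolute_url` are made up for illustration, and the sample `href` is an assumption modelled on the links printed above, not something copied from the page.

```python
from urllib.parse import parse_qs, urljoin, urlparse

BASE = "https://eresearch.fidelity.com"


def sector_id_from_href(href):
    """Return the value of the 'sector' query parameter, or None if it is absent."""
    query = urlparse(href).query
    values = parse_qs(query).get("sector")
    return values[0] if values else None


def absolute_url(href):
    """Resolve a site-relative href against BASE instead of concatenating strings."""
    return urljoin(BASE, href)


# Hypothetical href, shaped like the links in the output above.
href = "/eresearch/markets_sectors/sectors/sectors_in_market.jhtml?tab=learn&sector=50"
print(sector_id_from_href(href))
print(absolute_url(href))
```

In the loop, `sector_id_from_href(a["href"])` could stand in for the `split` call, and `absolute_url` could replace the string concatenation from the question.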