Problem retrieving data using tag and class in BeautifulSoup

Time:05-30

I am new to Python and am trying to write code for web scraping. I get None when I look up elements with BeautifulSoup by tag and class. For some tags and classes I retrieve the data successfully, but for others I get nothing back. Please look at the code below:

import requests
from bs4 import BeautifulSoup

Income_Statement = "https://www.investing.com/equities/sipchem-income-statement"
page_1 = requests.get(Income_Statement)

soup_1 = BeautifulSoup(page_1.content, 'html.parser')

content = soup_1.find('div', class_="wrapper hidden_navBar")

income = soup_1.find('table', class_="genTbl reportTbl")
print(content)
print(income)

This prints None for the first lookup. Please help, as I have been struggling with this for a long time. Thanks in advance.

CodePudding user response:

This happens because the server responds with a 403 (Forbidden) status, so the page you parse is not the one you expect. To get a 200 response, send a User-Agent header with the request, as follows.

import requests
from bs4 import BeautifulSoup
# Browser-like User-Agent so the server does not reject the request
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.62 Safari/537.36'}
Income_Statement = "https://www.investing.com/equities/sipchem-income-statement"
page_1 = requests.get(Income_Statement, headers=headers)
# Shows the response status; it should now be 200 rather than 403
print(page_1)

soup_1 = BeautifulSoup(page_1.content, 'html.parser')

content = soup_1.find('div', class_="wrapper hidden_navBar")

income = soup_1.find('table', class_="genTbl reportTbl")
print(content)
print(income)
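
To make a refused request fail loudly instead of silently parsing the wrong page, the fetch step could be wrapped roughly as below. This is only a sketch: the helper name fetch_html, its get parameter, and the 10-second timeout are assumptions made for illustration, not part of the original code. raise_for_status() is the standard requests call that raises requests.HTTPError on 4xx/5xx responses such as 403.

```python
import requests

# Browser-like User-Agent taken from the answer above; any current
# browser string should serve the same purpose.
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/102.0.5005.62 Safari/537.36'
}


def fetch_html(url, get=requests.get, timeout=10):
    """Return the page body as text, or raise if the server refused the request.

    `get` is a hypothetical injection point so the helper can be tried
    without touching the network; by default it is requests.get.
    """
    response = get(url, headers=HEADERS, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx status codes (for example 403).
    response.raise_for_status()
    return response.text
```

With this in place, fetch_html(Income_Statement) either returns HTML from a successful response or stops with an HTTPError that names the status code, which is easier to diagnose than a None from find().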

CodePudding user response:

When I run your code and inspect the HTML returned by the request, I can see that the request was blocked and a different page was returned. That page does not contain the elements you are looking for, which is why soup_1.find() returns None.
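
One way to see which page actually came back is to report its title whenever the lookup fails. The following is a minimal sketch under stated assumptions: the helper name find_income_table is hypothetical, and it assumes the target table carries exactly the class attribute "genTbl reportTbl" used in the question.

```python
from bs4 import BeautifulSoup


def find_income_table(html):
    """Return the income-statement table, or raise with a hint about the page received."""
    soup = BeautifulSoup(html, "html.parser")
    # Same lookup as in the question; find() returns None when nothing matches.
    table = soup.find("table", class_="genTbl reportTbl")
    if table is None:
        # The page title often reveals a block or error page.
        title = soup.title.get_text(strip=True) if soup.title else "(no title)"
        raise ValueError(f"table not found; page title was: {title!r}")
    return table
```

Calling find_income_table(page_1.text) after the request then either returns the table or raises an error naming the page that was served, so a blocked request is not mistaken for a wrong tag or class.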
