I am new to Python and am trying to write code for web scraping. I get None when I look up elements with Beautiful Soup by tag and class. For some tags and classes I retrieve the data successfully, but for others I get nothing. Please look at the code below:
import requests
from bs4 import BeautifulSoup
Income_Statement = "https://www.investing.com/equities/sipchem-income-statement"
page_1 = requests.get(Income_Statement)
soup_1 = BeautifulSoup(page_1.content, 'html.parser')
content = soup_1.find('div', class_="wrapper hidden_navBar")
income = soup_1.find('table', class_="genTbl reportTbl")
print(content)
print(income)
This prints None for the first lookup. Please help, as I have been struggling with this for a long time. Thanks in advance.
CodePudding user response:
You get None because the server answers your request with a 403 (Forbidden) status. To get a 200 response, send a browser-like User-Agent in the request headers, as follows.
import requests
from bs4 import BeautifulSoup
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.62 Safari/537.36'}
Income_Statement = "https://www.investing.com/equities/sipchem-income-statement"
page_1 = requests.get(Income_Statement, headers=headers)
print(page_1)  # shows the response status, e.g. <Response [200]>
soup_1 = BeautifulSoup(page_1.content, 'html.parser')
content = soup_1.find('div', class_="wrapper hidden_navBar")
income = soup_1.find('table', class_="genTbl reportTbl")
print(content)
print(income)
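If you make several requests, it may be convenient to set the header once on a session and let requests raise an error on a 403 instead of parsing the block page. This is only a minimal sketch: make_session and fetch_html are hypothetical helper names, the 10-second timeout is an arbitrary choice, and a browser-like User-Agent is not guaranteed to be enough for every site.

```python
import requests

# User-Agent string copied from the answer above; it is an assumption that
# the site accepts it, and any current browser string may work equally well.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/102.0.5005.62 Safari/537.36"
    )
}


def make_session(headers=HEADERS):
    """Hypothetical helper: a session that sends the given headers on every request."""
    session = requests.Session()
    session.headers.update(headers)
    return session


def fetch_html(session, url, timeout=10):
    """Hypothetical helper: return the page text, or raise on a 4xx/5xx status."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for e.g. 403
    return response.text
```

Calling fetch_html(make_session(), Income_Statement) would then either return the HTML or fail loudly with the status code, rather than quietly handing Beautiful Soup a page that does not contain the table.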
CodePudding user response:
When I run your code and inspect the HTML returned by the request, I can see that the request was blocked and a different page was served, which is why soup_1.find() returns None.
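One way to make this kind of failure visible is to check the status code and the result of find() before using it. The sketch below assumes a hypothetical helper named extract_table that takes the status code and HTML text you already fetched; the class string is the one from the question.

```python
from bs4 import BeautifulSoup


def extract_table(status_code, html, css_class="genTbl reportTbl"):
    """Hypothetical helper: return the first matching <table> or raise with a reason."""
    # A non-200 status usually means the body is an error or block page.
    if status_code != 200:
        raise RuntimeError(f"request was blocked or failed: HTTP {status_code}")

    soup = BeautifulSoup(html, "html.parser")
    # A string with spaces matches the exact value of the class attribute.
    table = soup.find("table", class_=css_class)
    if table is None:
        raise LookupError(f"no <table class='{css_class}'> in the returned page")
    return table


# Usage with the variables from the question (not executed here):
# income = extract_table(page_1.status_code, page_1.text)
```

With this check, a blocked request stops with a message that names the HTTP status, so you do not have to guess why a later print shows None.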