bs4 does not return full HTML

Time:11-01

I am trying to get some information from a website using bs4 and requests.

The URL is: https://www.element14.com/community/community/design-challenges/in-the-air-design-challenge/blog/2014/10/26/firecracker-analyzer-index

I am trying to get to a specific div:

<div id="jive-comment-tabs" xmlns="http://www.w3.org/1999/html"> ..... </div>

However, when I use the following code:

import requests
from bs4 import BeautifulSoup


URL = "https://www.element14.com/community/community/design-challenges/in-the-air-design-challenge/blog/2014/10/26/firecracker-analyzer-index"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "lxml")
print(soup.find('div', {'class': 'j-comment-wrapper'}))

I get None as the result, even though I know for a fact that the div is on the web page. I tried most of the solutions I found online, but none of them helped. Any ideas?

CodePudding user response:

What happens?

The website serves this part of the content dynamically, so the HTML that requests receives does not contain it and soup.find() returns None.
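To make the difference concrete, here is a minimal sketch that runs the same lookup against two made-up HTML snippets: one resembling what a server might send before any script runs, and one resembling the page after a browser has rendered it. Both snippets and the helper name has_class are hypothetical and only illustrate the idea; they are not the actual markup of the page in question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for illustration: what a server might send
# before any JavaScript has run.
STATIC_HTML = """
<html><body>
  <div id="content"></div>
  <script src="load-comments.js"></script>
</body></html>
"""

# Hypothetical markup after a browser has executed the script.
RENDERED_HTML = """
<html><body>
  <div id="content">
    <div class="j-comment-wrapper">First comment</div>
  </div>
</body></html>
"""


def has_class(html, class_name):
    """Return True if any div in html carries the given CSS class."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("div", {"class": class_name}) is not None


print(has_class(STATIC_HTML, "j-comment-wrapper"))    # False
print(has_class(RENDERED_HTML, "j-comment-wrapper"))  # True
```

Running the same check on page.text from the requests call in the question should tell you whether the div is part of the server's response at all.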

Alternative approach

Try Selenium: it drives a real browser that renders the page, so the dynamically loaded div is present in the page source.

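from bs4 import BeautifulSoup
from selenium import webdriver

# Selenium 4.6+ locates a matching chromedriver automatically; on older
# versions, pass the path via selenium.webdriver.chrome.service.Service.
driver = webdriver.Chrome()
driver.get('https://www.element14.com/community/community/design-challenges/in-the-air-design-challenge/blog/2014/10/26/firecracker-analyzer-index')

# page_source holds the DOM as rendered by the browser
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()

print(soup.find('div', {'class': 'j-comment-wrapper'}))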