Web scraping - Beautiful Soup

Time:07-05

I am trying to practice web scraping on a website. However, the soup does not contain the complete HTML content of the page, and I am not sure why this is happening.

Requesting help. This is the website and the code I am using:

import requests
import bs4
res = requests.get("https://www.woodmac.com/")
soup = bs4.BeautifulSoup(res.text, "lxml")

CodePudding user response:

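You can do it this way:

# Python 3: urlopen lives in urllib.request
from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "https://www.woodmac.com/"
page = urlopen(url).read()
soup = BeautifulSoup(page, "lxml")  # name the parser explicitly
print(soup)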

CodePudding user response:

The page you want to scrape loads some of its elements dynamically using JavaScript, so the right approach depends on which elements you need. If they are static, the method you are using is fine; if they are dynamic, try Selenium, Puppeteer, or Playwright.
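One way to check which case applies is to parse the raw response and see whether the element you need is already in it. The sketch below assumes Beautiful Soup is installed; has_selector is a hypothetical helper name, and the sample HTML and CSS selectors are made up for illustration.

```python
from bs4 import BeautifulSoup


def has_selector(html: str, selector: str) -> bool:
    """Return True if the CSS selector matches anything in the given HTML."""
    # "html.parser" ships with Python, so no extra parser is required here.
    soup = BeautifulSoup(html, "html.parser")
    return soup.select_one(selector) is not None


# Made-up page: the heading is in the markup, while the list would only be
# filled in later by the script, which Beautiful Soup never executes.
sample_html = """
<html><body>
  <h1 class="title">Reports</h1>
  <ul id="items"></ul>
  <script>/* JavaScript would add list items here in a browser */</script>
</body></html>
"""

static_found = has_selector(sample_html, "h1.title")
dynamic_found = has_selector(sample_html, "#items li")
print(static_found, dynamic_found)
```

With a real page, res.text from the requests.get call in the question can be passed as the html argument. If the selector is not found there but the element is visible in the browser, it is most likely added by JavaScript.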

Below is a Selenium example (using Chrome):

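from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

options = Options()  # add Chrome flags here if needed
service = Service("your_driver")  # path to your chromedriver executable
driver = webdriver.Chrome(service=service, options=options)
driver.get("https://www.woodmac.com/")

# get element
driver.find_element(By.XPATH, "XPATH")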
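
Once the browser has loaded the page, its HTML can be handed back to Beautiful Soup, which is closer to what the question was after. A minimal sketch, assuming Beautiful Soup is installed: rendered_soup is a hypothetical helper that accepts any object with a get method and a page_source attribute (a Selenium WebDriver has both), and FakeDriver is a made-up stand-in so the example runs without a browser.

```python
from bs4 import BeautifulSoup


def rendered_soup(driver, url: str) -> BeautifulSoup:
    """Load url in the given driver and parse the resulting HTML."""
    driver.get(url)
    # page_source is the page's HTML as reported by the browser, which
    # typically includes elements that JavaScript has added so far.
    return BeautifulSoup(driver.page_source, "html.parser")


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver, for illustration only."""

    def __init__(self, html: str):
        self._html = html
        self.page_source = ""

    def get(self, url: str) -> None:
        # A real driver would fetch the URL and run its scripts here.
        self.page_source = self._html


fake = FakeDriver("<html><body><ul id='items'><li>A</li><li>B</li></ul></body></html>")
soup = rendered_soup(fake, "https://www.woodmac.com/")
items = [li.get_text() for li in soup.select("#items li")]
print(items)
```

With a real Chrome driver, rendered_soup(driver, "https://www.woodmac.com/") would be the equivalent call. Content that loads late may still need an explicit wait before page_source is read.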