I am trying to practice web scraping on a website. However, soup does not contain the complete HTML content of the page, and I am not sure why this is happening.
Any help would be appreciated. This is the website and the code I am using:
import bs4
import requests

res = requests.get("https://www.woodmac.com/")
soup = bs4.BeautifulSoup(res.text, "lxml")
CodePudding user response:
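You can do it this way:
from urllib.request import urlopen  # Python 3 location of urlopen
from bs4 import BeautifulSoup

url = "https://www.woodmac.com/"
page = urlopen(url).read()
soup = BeautifulSoup(page, "html.parser")  # name the parser explicitly
print(soup)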
CodePudding user response:
Some elements on the page you want to scrape are loaded dynamically with JavaScript, so the answer depends on which elements you need. If they are static (already present in the HTML the server returns), your current approach is fine; if they are dynamic, use a browser automation tool such as Selenium, Puppeteer, or Playwright.
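One rough way to tell the two cases apart is to check whether the element you need appears in the raw HTML at all. The sketch below is only an illustration: the helper names `IdFinder` and `is_in_static_html` are hypothetical, it looks elements up by `id` only, and it uses the standard library's `html.parser` so it runs without extra packages. The same check could be done with `soup.find(id=...)` on the soup from the question.

```python
from html.parser import HTMLParser


class IdFinder(HTMLParser):
    """Records whether any tag with the wanted id attribute was seen."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def is_in_static_html(html, wanted_id):
    """Return True if an element with id=wanted_id exists in the raw HTML."""
    finder = IdFinder(wanted_id)
    finder.feed(html)
    return finder.found
```

For example, `is_in_static_html(res.text, "some-id")` (with `"some-id"` standing in for the id you are after) returning False while the element is visible in the browser suggests it is added later by JavaScript.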
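Below is a Selenium example (using Chrome):
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

options = Options()
service = Service("your_driver")  # path to your chromedriver executable
driver = webdriver.Chrome(service=service, options=options)
driver.get("https://www.woodmac.com/")
# get element (replace "XPATH" with your own XPath expression)
element = driver.find_element(By.XPATH, "XPATH")
driver.quit()  # close the browser when you are done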