I am trying to run a script that simply finds a few numbers on a website, but it doesn't seem to let me get past a certain point. Here is the script:
from requests_html import HTMLSession
import requests
url = "https://auction.chimpers.xyz/"
try:
    s = HTMLSession()
    r = s.get(url)
except requests.exceptions.RequestException as e:
    print(e)
r.html.render(sleep=1)
title = r.html.find("title",first=True).text
print(title)
divs_found = r.html.find("div")
print(divs_found)
meta_desc = r.html.xpath('//*[@id="description-view"]/div',first=True)
print(meta_desc)
price = r.html.find(".m-complete-info div",first=True)
print(price)
The result of this is:
Chimpers Genesis 100
[<Element 'div' id='app'>, <Element 'div' data-v-1d311e85='' id='m-connection' class=('manifold',)>, <Element 'div' id='description-view'>, <Element 'div' class=('manifold', 'm-complete-view')>, <Element 'div' data-v-cf8dbfe2='' class=('manifold', 'loading-screen')>, <Element 'div' class=('manifold-logo',)>]
<Element 'div' class=('manifold', 'm-complete-view')>
None
[Finished in 3.9s]
website : https://auction.chimpers.xyz/
and the information I am trying to find is here
Clearly there are more HTML elements past the ones printed in the list above. However, every time I try to access them, even using r.html.xpath('//*[@id="description-view"]/div/div[2]/div/div[2]/span/span[1]'), it returns None, even though this is the XPath I copied via the inspector in Google Chrome.
Any reason why this is, and how would I go about fixing it?
CodePudding user response:
I don't actually know if it's even possible to do with requests_html, but it is with selenium.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
url = "https://auction.chimpers.xyz/"
class_names = ["m-price-label", "m-price-data"]
driver_options = Options()
driver_options.add_argument("--headless")
driver = webdriver.Chrome(options=driver_options)
driver.get(url)
results = {}
try:
    for class_name in class_names:
        element = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.CLASS_NAME, class_name)))
        # Getting the inner text of the html tag
        results[class_name] = element.get_attribute("textContent")
finally:
    driver.quit()
print(results)
Feel free to use a webdriver other than Chrome.