This code works and returns the single digit number that i want but its so slow and takes good 10 seconds to complete.I will be running this 4 times for my use so thats 40 seconds wasted every run. ` from selenium import webdriver from bs4 import BeautifulSoup
options = webdriver.FirefoxOptions()
options.add_argument('--headless')
driver = webdriver.Firefox(options=options)
driver.get('https://warframe.market/items/ivara_prime_blueprint')
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
price_element = soup.find('div', {'class': 'row order-row--Alcph'})
price2=price_element.find('div',{'class':'order-row__price--hn3HU'})
price = price2.text
print(int(price))
driver.close()`
This code on the other hand does not work. It returns None. ` import requests from bs4 import BeautifulSoup
url='https://warframe.market/items/ivara_prime_blueprint'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')
price_element=soup.find('div', {'class': 'row order-row--Alcph'})
price2=price_element.find('div',{'class':'order-row__price--hn3HU'})
price = price2.text
print(int(price))`
First thought was to add user agent but still did not work. When I print(soup) it gives me html code but when i parse it further it stops and starts giving me None even tho its the same command like in selenium example.
CodePudding user response:
The data is loaded dynamically within a <script>
tag so Beautifulsoup doesn't see it (it doesn't render Javascript).
As an example, to get the data, you can use:
import json
import requests
from bs4 import BeautifulSoup
url = "https://warframe.market/items/ivara_prime_blueprint"
headers = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
script_tag = soup.select_one("#application-state")
json_data = json.loads(script_tag.string)
# Uncomment the line below to see all the data
# from pprint import pprint
# pprint(json_data)
for data in json_data["payload"]["orders"]:
print(data["user"]["ingame_name"])
Prints:
Rogue_Monarch
Rappei
KentKoes
Tenno61189
spinifer14
Andyfr0nt
hollowberzinho
You can access the data as a dict
and acess the keys
/values
.
I'd recommend an online tool to view all the JSON since it's quite large.
See also
Parsing out specific values from JSON object in BeautifulSoup