I was trying to track the price of a product using Beautiful Soup, but whenever I run this code I get a 6-digit number, which I assume has something to do with reCAPTCHA. I have tried numerous times and checked the headers, the URL, and the tags, but nothing seems to work.
from bs4 import BeautifulSoup
import requests
from os import environ
import lxml
headers = {
    "User-Agent": environ.get("User-Agent"),
    "Accept-Language": environ.get("Accept-Language")
}
amazon_link_address = "https://www.amazon.in/Razer-Basilisk-Wired-Gaming-RZ01-04000100-R3M1/dp/B097F8H1MC/?_encoding=UTF8&pd_rd_w=6H9OF&content-id=amzn1.sym.1f592895-6b7a-4b03-9d72-1a40ea8fbeca&pf_rd_p=1f592895-6b7a-4b03-9d72-1a40ea8fbeca&pf_rd_r=1K6KK6W05VTADEDXYM3C&pd_rd_wg=IobLb&pd_rd_r=9fcac35b-b484-42bf-ba79-a6fdd803abf8&ref_=pd_gw_ci_mcx_mr_hp_atf_m"
response = requests.get(url=amazon_link_address, headers=headers)
soup = BeautifulSoup(response.content, features="lxml").prettify()
price = soup.find("a-price-whole")
print(price)
CodePudding user response:
The "a-price-whole" class is inside the <span> tags, so BS4 is not able to find it. This solution worked for me: I changed your "find" to a "find_all" and made it scan through all of the spans until it finds the class you are searching for, then used the iterator's get_text() to print the price. Hope this helps!
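soup = BeautifulSoup(response.content, features="lxml")
price = soup.find_all("span")
itemPrice = None  # stays None if no matching span is found
for i in price:
    try:
        # i['class'] is a list of the tag's class names
        if i['class'] == ['a-price-whole']:
            # [:-1] drops the last character of the span text
            itemPrice = f"${str(i.get_text())[:-1]}"
            break
    except KeyError:
        # this span has no class attribute
        continue
print(itemPrice)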
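A shorter variant of the same idea, as a rough sketch: Beautiful Soup can filter by class directly with find("span", class_="a-price-whole"), which avoids the manual loop. Note also that the answer's code drops the .prettify() call from the question. prettify() returns a plain str, so calling find on it is str.find, which returns a character index (or -1) rather than a tag; that may well be where the 6-digit number came from. The SAMPLE_HTML string and the extract_whole_price helper below are hypothetical stand-ins for illustration, and the real Amazon markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.content; the real page markup may differ.
SAMPLE_HTML = """
<div>
  <span class="a-price">
    <span class="a-price-whole">4,999<span class="a-price-decimal">.</span></span>
  </span>
</div>
"""


def extract_whole_price(html):
    """Return the text of the first span with class a-price-whole, or None."""
    # Keep the BeautifulSoup object itself; do not call .prettify() here,
    # because that would turn it into a str.
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("span", class_="a-price-whole")
    if tag is None:
        return None
    # Strip whitespace and any trailing decimal point from the text.
    return tag.get_text(strip=True).rstrip(".")


print(extract_whole_price(SAMPLE_HTML))
print(extract_whole_price("<p>no price here</p>"))
```

With the live page you would pass response.content instead of SAMPLE_HTML. If the function returns None there, it is worth printing part of the response to check whether Amazon served a CAPTCHA page instead of the product page.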