I am making a web scraper for Amazon as a personal project, but I am stuck: whenever I call get_text() I get an AttributeError, even though the same code works fine in the video tutorial I am following. At first I didn't pass any headers, but I thought that might be the cause, so I copied the header exactly as the instructor wrote it in the video.
import requests
from bs4 import BeautifulSoup
URL="https://www.amazon.in/dp/B074WZJ4MF/ref=redir_mobile_desktop?_encoding=UTF8&aaxitk=8bc2212eee66e1c1bdca057df16f612f&hsa_cr_id=2722802130102&pd_rd_plhdr=t&pd_rd_r=135b3806-45ad-402d-9df7-0f14d458f874&pd_rd_w=19o2S&pd_rd_wg=TBmei&ref_=sbx_be_s_sparkle_mcd_asin_0_title"
HEADERS={"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"}
def getprice():
    page= requests.get(URL, headers=HEADERS)
    # print(htmlcontent)
    soup=BeautifulSoup(page.content,'html.parser')
    # print(soup.prettify)
    title=soup.find(id="productTitle").get_text()
    print(title)
if __name__=="__main__":
    getprice()
That is the code above. I don't know why this is happening; I have attached the output as well.
The link is just a randomly chosen product page, and the id I look up is that of the product title, which is what I want to display. Please help; I have searched all over the internet for this.
CodePudding user response:
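Your HEADERS variable should be a dictionary, but as written it is a set: the braces contain only the user-agent string, with no key. requests expects a mapping of header names to values, so set the User-Agent key explicitly: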
HEADERS={"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"}
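To make the difference concrete, here is a small sketch that needs no network access. The names USER_AGENT, as_written and fixed are only illustrative and do not appear in the question.

```python
# Illustrative names only; the user-agent string is the one from the question.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
)

# Braces around a bare value, with no "key: value" pair, build a set.
as_written = {USER_AGENT}

# Braces around a "key: value" pair build a dict, which is what headers= needs.
fixed = {"User-Agent": USER_AGENT}

print(type(as_written).__name__)     # set
print(type(fixed).__name__)          # dict
print(hasattr(as_written, "items"))  # False: a set has no key/value pairs
```

Separately, soup.find() returns None when no element matches, and calling get_text() on None also raises an AttributeError. Checking the result of find() before calling get_text() is therefore a reasonable guard, even once the header is fixed.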
CodePudding user response:
If you are looking for an easier way to get this working, you can scrape the page with Selenium.
Here is the code:
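from selenium import webdriver
from selenium.webdriver.common.by import By

# Positional driver path as in the original; newer Selenium releases configure this differently
driver = webdriver.Chrome("C:/chromedriver.exe")
url = 'https://.....'
driver.get(url)
# Read the text of the span whose class attribute matches exactly
price = driver.find_element(By.XPATH, "//span[@class='a-price a-text-price a-size-medium apexPriceToPay']").text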