Beautiful soup some errors

Time:12-27

I am building a web scraper for Amazon as a personal project, but I am stuck: whenever I call get_text() I get an AttributeError, even though the same code works fine in the video tutorial I am following. At first I did not pass any headers. Then I suspected that might be the cause, so I copied the headers exactly as the instructor wrote them in the video.

import requests
from bs4 import BeautifulSoup
URL="https://www.amazon.in/dp/B074WZJ4MF/ref=redir_mobile_desktop?_encoding=UTF8&aaxitk=8bc2212eee66e1c1bdca057df16f612f&hsa_cr_id=2722802130102&pd_rd_plhdr=t&pd_rd_r=135b3806-45ad-402d-9df7-0f14d458f874&pd_rd_w=19o2S&pd_rd_wg=TBmei&ref_=sbx_be_s_sparkle_mcd_asin_0_title"
HEADERS={"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"}

def getprice():
    page= requests.get(URL, headers=HEADERS)

    # print(htmlcontent)
    soup=BeautifulSoup(page.content,'html.parser')
    # print(soup.prettify)
    title=soup.find(id="productTitle").get_text()
    
    print(title)

if __name__=="__main__":
    getprice()

That is the code. I don't know why this is happening; the output was attached as an image.

The link is just a randomly chosen product page, and the id I look up is that of the product title, which is what I want to display. Please help; I have searched all over the internet for this.

CodePudding user response:

Your HEADERS variable needs to be a dictionary, but as written it is a set containing a single string. Set the User-Agent key explicitly:

HEADERS={"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"}
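As a rough sketch of the two things worth checking here: braces without a key produce a set rather than a dict, and find() returns None when nothing matches, so chaining .get_text() onto it raises an AttributeError. The helper names check_headers and extract_title are made up for illustration, and soup is assumed to be the BeautifulSoup object from the question.

```python
from collections.abc import Mapping

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
)


def check_headers(headers):
    """Hypothetical helper: fail early if headers is not a mapping (e.g. a set)."""
    if not isinstance(headers, Mapping):
        raise TypeError(f"headers must be a dict, got {type(headers).__name__}")
    return headers


def extract_title(soup):
    """Hypothetical helper: return the product title, or None if it is missing.

    find() returns None when no element has id="productTitle", so the
    result is checked before calling get_text() on it.
    """
    element = soup.find(id="productTitle")
    if element is None:
        return None
    return element.get_text(strip=True)


# {"..."} is a set literal; {"User-Agent": "..."} is a dict literal.
HEADERS = check_headers({"User-Agent": USER_AGENT})
```

With this in place, getprice() could call extract_title(soup) and print a message when it gets None back, which may happen if the site serves a different page to the scraper than to a browser.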

CodePudding user response:

If you are looking for an easier route, you can scrape the page with Selenium instead.

Here is the code.

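from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Selenium 4 style: the chromedriver path is passed through a Service object
driver = webdriver.Chrome(service=Service("C:/chromedriver.exe"))
url = "https://....."
driver.get(url)
# find_element(By.XPATH, ...) replaces the older find_element_by_xpath helper
price = driver.find_element(By.XPATH, "//span[@class='a-price a-text-price a-size-medium apexPriceToPay']").text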