Cannot WebScrape text from image link

Time:11-18

I am unable to scrape the rating percentage from an Amazon product page. I get only null values. Here is my code:

from typing import Text
from bs4 import BeautifulSoup
import requests
import pandas as pd
from datetime import date
import os

url = 'https://www.amazon.in/dp/B09BJQCTMX?ref=myi_title_dp'
req = requests.get(url)
content = BeautifulSoup(req.content, "lxml")
data = content.find_all('a', class_='a-link-normal')
print(data)

I have provided the correct class name, but only null values are retrieved.

CodePudding user response:

Try passing a browser-like User-Agent header in the request call; the data should then be present in content.

import requests
from bs4 import BeautifulSoup

# Browser-like User-Agent; the default python-requests one may be rejected
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36"}
url = 'https://www.amazon.in/dp/B09BJQCTMX?ref=myi_title_dp'
req = requests.get(url, headers=headers)
content = BeautifulSoup(req.content, "lxml")
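To see what the headers argument changes, here is a small sketch that builds the request without sending it and compares the User-Agent that would go out in each case. The helper name build_request is hypothetical. That Amazon treats the default python-requests User-Agent differently is an assumption based on the behaviour described in the question, not something this code verifies.

```python
import requests

URL = "https://www.amazon.in/dp/B09BJQCTMX?ref=myi_title_dp"
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36"
)


def build_request(url, user_agent=None):
    """Hypothetical helper: prepare a GET request without sending it."""
    session = requests.Session()
    headers = {"User-Agent": user_agent} if user_agent else {}
    # prepare_request merges the session's default headers with ours;
    # headers passed on the request take priority over the defaults.
    return session.prepare_request(requests.Request("GET", url, headers=headers))


default_req = build_request(URL)
custom_req = build_request(URL, BROWSER_UA)

# Without a custom header, requests identifies itself as python-requests/<version>
print(default_req.headers["User-Agent"])
# With the custom header, the browser-like string is sent instead
print(custom_req.headers["User-Agent"])
```

If the page still comes back without the expected elements after this change, printing req.status_code and the first few hundred characters of req.text is a quick way to check what the server actually returned.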

Now use the appropriate class to find the data:

# Look the element up once, then read its text and the link inside it
item = content.find("span", class_="a-list-item")
text_data = item.get_text(strip=True)
href_data = item.find("a")['href']
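Because find returns None when nothing matches, the lookup above raises an error if the page did not contain the expected element. A minimal sketch of a more defensive version follows. The HTML snippet is made up for illustration and is not Amazon's real markup, and the function name extract_first_item is hypothetical. It uses the built-in "html.parser" so the example runs without lxml.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; not Amazon's actual page structure.
SAMPLE_HTML = """
<ul>
  <li><span class="a-list-item"><a href="/example-path">Example link</a></span></li>
</ul>
"""


def extract_first_item(soup):
    """Hypothetical helper: return (text, href) of the first matching span.

    Returns (None, None) when the span is missing, and a None href when
    the span contains no link.
    """
    item = soup.find("span", class_="a-list-item")
    if item is None:
        return None, None
    link = item.find("a")
    href = link.get("href") if link is not None else None
    return item.get_text(strip=True), href


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
text_data, href_data = extract_first_item(soup)
print(text_data, href_data)

# A page without the expected span yields (None, None) instead of an exception
empty_soup = BeautifulSoup("<html><body></body></html>", "html.parser")
print(extract_first_item(empty_soup))
```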
