Web scraping with Python: I can't print my variable

Time:07-02

In my Django project I use BeautifulSoup for web scraping. The scraping works, but I can't print or slice the result. When I try (I'm doing this in views.py), I get this error:

"UnicodeEncodeError: 'charmap' codec can't encode character '\u200e' in position 59: character maps to <undefined>"

How can I print the x variable?

    import requests
    from bs4 import BeautifulSoup

    # link and getRandomUserAgent() are defined elsewhere in the project (not shown)
    URL = link
    user_agent = getRandomUserAgent()
    headers = {"User-Agent": user_agent}

    page = requests.get(URL, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')

    mylist = soup.find_all("td", class_="a-size-base prodDetAttrValue")

    for x in mylist:
        print(x)

CodePudding user response:

The data you want can't be pulled with bs4 alone because it is loaded dynamically by JavaScript, but you can get it by using bs4 together with Selenium. With the code below, it didn't throw a UnicodeEncodeError:

    import time
    from selenium import webdriver
    from bs4 import BeautifulSoup
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    # driver.get() returns None, so there is nothing useful to assign here
    driver.get('https://www.amazon.com/dp/B09N4ZL8NV')
    driver.maximize_window()
    time.sleep(3)  # give the JavaScript-rendered content time to load

    # Parse the rendered page source, then close the browser
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    driver.quit()
    mylist = soup.find_all("td", class_="a-size-base prodDetAttrValue")

    for x in mylist:
        # Print only the text of each cell, not the whole tag
        print(x.get_text(strip=True))

Output:

13.3 Inches
‎1920 x 1200 pixels
‎1920 x 1200 Pixels
‎2.8 GHz core_i7_family
‎16 GB LPDDR4X
‎2.8 GHz
‎512 GB Flash Memory Solid State
‎Intel Iris Xe Graphics
‎Intel
‎Iris Xe Graphics
‎Bluetooth, 802.11ax
‎Lenovo
‎ThinkPad X13 Yoga Gen 2
‎20W80056US
‎PC
‎Windows 11 Pro
‎2.65 pounds
‎8.4 x 12 x 0.61 inches
‎8.4 x 12 x 0.61 inches
‎Black
‎Intel
‎1
‎DDR4 SDRAM
‎512 GB
‎No
B09N4ZL8NV
December 6, 2021
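
Most of the values above still start with an invisible left-to-right mark (U+200E), which is the character named in the error from the question. My guess (an assumption, since I can't see your console settings) is that your terminal uses a legacy code page such as cp1252, which has no mapping for that character. A minimal sketch of how you could check for it and strip it before printing; clean_text and encodable are hypothetical helper names, not part of bs4 or Selenium:

```python
LRM = "\u200e"  # LEFT-TO-RIGHT MARK, invisible when printed


def clean_text(text: str) -> str:
    """Remove left-to-right marks and surrounding whitespace."""
    return text.replace(LRM, "").strip()


def encodable(text: str, encoding: str = "cp1252") -> bool:
    """Return True if text can be encoded with the given codec.

    cp1252 is only an assumed console encoding; pass your own.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Sample value shaped like the scraped cells (with the leading mark)
sample = "\u200e1920 x 1200 pixels"

if not encodable(sample):
    # The raw value would fail on a cp1252 console, so clean it first
    sample = clean_text(sample)

print(sample)
```

In the loop you would call it as print(clean_text(x.get_text())). If you would rather keep the characters, setting the environment variable PYTHONIOENCODING=utf-8 before starting Python is another option that may help, depending on your terminal.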
 