In my Django project I use BeautifulSoup for web scraping. It works, but I can't print or slice the result. When I try (I'm doing this in views.py), I get this error:
"UnicodeEncodeError: 'charmap' codec can't encode character '\u200e' in position 59: character maps to <undefined>"
How can I print the x variable?
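import requests
from bs4 import BeautifulSoup

URL = link  # link: the product page URL, defined elsewhere in the view
user_agent = getRandomUserAgent()  # my own helper, not shown here
headers = {"User-Agent": user_agent}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
mylist = soup.find_all("td", class_="a-size-base prodDetAttrValue")
for x in mylist:
    print(x)  # the print call that raises the UnicodeEncodeError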
CodePudding user response:
The data you want can't be pulled with bs4 alone because it is loaded dynamically by JavaScript. You can get it by combining bs4 with Selenium, and in my run this did not throw the UnicodeEncodeError:
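import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Start Chrome; webdriver_manager downloads a matching chromedriver
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get('https://www.amazon.com/dp/B09N4ZL8NV')  # get() returns None, so there is nothing to assign
driver.maximize_window()
time.sleep(3)  # crude wait for the JavaScript-rendered content

# Parse the rendered HTML and print the text of each detail cell
soup = BeautifulSoup(driver.page_source, 'html.parser')
mylist = soup.find_all("td", class_="a-size-base prodDetAttrValue")
for x in mylist:
    print(x.get_text(strip=True))

driver.quit()  # close the browser when done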
Output:
13.3 Inches
1920 x 1200 pixels
1920 x 1200 Pixels
2.8 GHz core_i7_family
16 GB LPDDR4X
2.8 GHz
512 GB Flash Memory Solid State
Intel Iris Xe Graphics
Intel
Iris Xe Graphics
Bluetooth, 802.11ax
Lenovo
ThinkPad X13 Yoga Gen 2
20W80056US
PC
Windows 11 Pro
2.65 pounds
8.4 x 12 x 0.61 inches
8.4 x 12 x 0.61 inches
Black
Intel
1
DDR4 SDRAM
512 GB
No
B09N4ZL8NV
December 6, 2021
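
A side note on the error in the question: '\u200e' is the invisible LEFT-TO-RIGHT MARK, and the 'charmap' message usually means the console that print() writes to uses a legacy code page that has no slot for it. The sketch below shows two ways to make such text printable. It is only an illustration under assumptions: the helper names strip_direction_marks and encodable are hypothetical (not from the thread), cp1252 is a guess at the console encoding, and the sample string stands in for x.get_text() from the question.

```python
# Sketch only: the helper names here are hypothetical, not from the thread.

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, the character named in the error
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, its counterpart


def strip_direction_marks(text: str) -> str:
    """Remove the invisible direction marks from text."""
    return text.replace(LRM, "").replace(RLM, "")


def encodable(text: str, encoding: str = "cp1252") -> str:
    """Replace anything the target encoding cannot represent with '?'.

    cp1252 is an assumed console encoding; pass your own if it differs.
    """
    return text.encode(encoding, errors="replace").decode(encoding)


sample = "13.3 Inches" + LRM  # stand-in for x.get_text() from the question
print(strip_direction_marks(sample))  # the mark is dropped
print(encodable(sample))              # the mark becomes '?'
```

In the question's loop this would look like print(strip_direction_marks(x.get_text())). Another option, if sys.stdout is a regular text stream on Python 3.7 or later, is to call sys.stdout.reconfigure(encoding="utf-8") once before printing; whether the console then displays the text correctly depends on the terminal.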