Web scraping can't grab the source of an image

Time:02-02

import requests
import bs4
res2 = requests.get("https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)")
soup = bs4.BeautifulSoup(res2.text,'lxml')
soup.select(".image")
computer = soup.select(".image")[0]
computer['class']
computer['src']

When I run computer['class'] I get back the class name, but when I run computer['src'] I get the following error:

KeyError                                  Traceback (most recent call last)
Input In [19], in <cell line: 1>()
----> 1 computer['src']

File C:\ProgramData\Anaconda3\lib\site-packages\bs4\element.py:1519, in Tag.__getitem__(self, key)
   1516 def __getitem__(self, key):
   1517     """tag[key] returns the value of the 'key' attribute for the Tag,
   1518     and throws an exception if it's not there."""
-> 1519     return self.attrs[key]

KeyError: 'src'

CodePudding user response:

The error occurs because the element matched by .image has no "src" key in its attrs dictionary. The source URL lives on the img tag nested inside the "computer" element, so select that tag first and read its "src" attribute:

computer_img = computer.select("img")[0]
computer_img["src"]
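To see why the lookup fails without hitting the network, here is a minimal sketch run against a hand-written snippet. The HTML string and its URLs are hypothetical, simplified to mimic a link with class "image" wrapping an img tag; the real page markup may differ. It uses the built-in "html.parser" so lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption, not copied from the real page):
# a link carrying class="image" that wraps the actual <img> tag.
HTML = """
<a href="/wiki/File:Deep_Blue.jpg" class="image">
  <img alt="Deep Blue" src="//upload.example.org/thumb/Deep_Blue.jpg" width="220">
</a>
"""

soup = BeautifulSoup(HTML, "html.parser")

# select_one returns the first match or None, avoiding an IndexError on [0]
link = soup.select_one(".image")

link_class = link["class"]   # "class" is multi-valued, so this is a list
link_src = link.get("src")   # .get() returns None instead of raising KeyError

# The src attribute sits on the nested <img>, not on the outer element
img = link.select_one("img")
img_src = img["src"] if img is not None else None

print(link_class, link_src, img_src)
```

Using .get() and select_one() rather than tag[key] and select(...)[0] is a design choice here: both return None when something is missing, which is easier to handle in a loop over many elements than catching KeyError or IndexError.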

CodePudding user response:

It looks like you want the href. If that's the case then:

import requests
from bs4 import BeautifulSoup as BS

(r := requests.get('https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)')).raise_for_status()
soup = BS(r.text, 'lxml')

for image in soup.select('.image'):
    if href := image.get('href'):
        print(href)

Output:

/wiki/File:Deep_Blue.jpg
/wiki/File:Chess_Programming.svg
/wiki/File:Kasparov_Magath_1985_Hamburg-2.png
/wiki/File:One_of_Deep_Blue's_processors_(2586060990).jpg
/wiki/File:Chess.svg
/wiki/File:Chess.svg
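
The hrefs printed above are relative to the site root. If full URLs are needed, one option is the standard library's urllib.parse.urljoin, sketched below under the assumption that the page URL from the question is used as the base; the hrefs list is just a sample taken from the output.

```python
from urllib.parse import urljoin

# Base URL taken from the question; used only to resolve relative links
BASE = "https://en.wikipedia.org/wiki/Deep_Blue_(chess_computer)"

# Sample of the relative hrefs from the output above
hrefs = ["/wiki/File:Deep_Blue.jpg", "/wiki/File:Chess.svg"]

# A leading "/" replaces the whole path of the base URL
absolute = [urljoin(BASE, href) for href in hrefs]

for url in absolute:
    print(url)
```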