Can't get the src attribute of an img using BeautifulSoup4


I'm trying to scrape a website using BeautifulSoup. I want the src attribute of an image, but what I get back is something completely different.

This is the img element: element

This is the code I'm using to scrape it (it returns other attributes perfectly fine so I'm sure I'm getting the right element):

pic = hrefs.a.div.div.span.img.get('src')

And the output of the pic variable is this:


CodePudding user response:

I tried to reproduce your example from the screenshot (a screenshot is hard to reproduce from). Did you try this?

from bs4 import BeautifulSoup

html = """
<div>
    <img alt="Air Jordan 1" src="https://cdn.myikas.com/images/blablabla"/>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# Find the first img tag and read its src attribute
print(soup.find('img').get('src'))

CodePudding user response:

I am using the following HTML document:

<!DOCTYPE html>
    <title>Title of the document</title>
      <p>From wikipedia</p>
      <img src="
        //8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />

The above is the HTML I am scraping.

This is the image:

<img src="
        //8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot" />

The part we need to decode is the base64 text at the end of the src value, so you need to get the string after the last comma in src:

src.split(",")[-1] # This will get the last sequence of text after a comma
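A src value is not always a data URI; it can also be an ordinary URL, as in the first answer. Here is a minimal sketch of the split that checks for that case first. `split_data_uri` is a hypothetical helper name, and the example value is made up for illustration ("aGVsbG8=" is the base64 encoding of b"hello"):

```python
def split_data_uri(src):
    """Return (header, payload) for a data: URI, or None for an ordinary URL."""
    src = src.strip()  # the attribute value may contain surrounding whitespace
    if not src.startswith("data:"):
        return None
    # Split on the last comma, as in src.split(",")[-1]
    header, _, payload = src.rpartition(",")
    return header, payload


# Made-up example value, not taken from the question
example_src = "data:text/plain;base64,aGVsbG8="
print(split_data_uri(example_src))
print(split_data_uri("https://cdn.myikas.com/images/blablabla"))
```

The second call prints None, which is a hint that the src should be downloaded rather than decoded.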

OK, so now you have the string. (I am not going to use your string, because when I tried it I didn't see an image; I will use the string from the HTML above.)

This is how you decode it in Python 3.10.4:

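import base64

# image_base64_string is the text after the last comma in src
image_decoded = base64.b64decode(image_base64_string)

# Now it's time to save the image; the mode must be exactly "wb"
with open("mygif.gif", "wb") as myfile:
    myfile.write(image_decoded)
# You should be able to see the file and open it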
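The file name above assumes a GIF, but the decoded bytes may be in another format. Here is a minimal sketch, assuming you want to choose the extension from the leading bytes of the decoded data. `SIGNATURES` and `guess_extension` are hypothetical names, not part of base64 or BeautifulSoup, and the payload is only a GIF header built for the demo, not a full image:

```python
import base64

# Leading bytes ("magic numbers") of a few common image formats
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": ".png",
    b"GIF87a": ".gif",
    b"GIF89a": ".gif",
    b"\xff\xd8\xff": ".jpg",
}


def guess_extension(data):
    """Return a file extension for the decoded bytes, or ".bin" if unknown."""
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            return ext
    return ".bin"


# Made-up payload for the demo: a bare GIF header, base64-encoded
payload = base64.b64encode(b"GIF89a").decode("ascii")
decoded = base64.b64decode(payload)
print(guess_extension(decoded))
```

You could then build the output name as "myimage" + guess_extension(image_decoded) instead of hard-coding the extension.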