How to download image from URL using beautiful soup in high quality?

Time:10-29

I am trying to download images with Beautiful Soup while importing a list of URLs from a .csv file. At the moment I get results like the one below:

<img src="backup/remote_2109image/008f3ef7-1da9-11ec-abad-88ae1db4aa6901.jpg" width="350" height="616"/>

In the code below, I am trying to get the image that has the class 'pick' from each URL.

Now, how can I download it into a folder?

import csv
import requests
import os
import urllib
from bs4 import BeautifulSoup as bs

with open('cat.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        imagesname = ' '.join(row)
        r = requests.get(imagesname)

        soup = bs(r.content, 'html.parser') 
        tables = soup.find_all('img', class_='pick')
    
        for image in tables:
            print(image)

CodePudding user response:

You might try this:

import csv
import urllib.parse

import requests
from bs4 import BeautifulSoup as bs

with open('cat.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        imagesname = ' '.join(row)
        r = requests.get(imagesname)

        soup = bs(r.content, 'html.parser')
        tables = soup.find_all('img', class_='pick')

        inParsed = urllib.parse.urlparse(imagesname)  # break down the page URL
        rootUrl = f'{inParsed.scheme}://{inParsed.netloc}'  # scheme + host = site root

        for image in tables:
            # join the site root with the tag's src to get a full image URL
            imageUrl = urllib.parse.urljoin(rootUrl, image.get('src'))
            # use the last non-empty path segment as the file name
            saveImgAs = [u for u in imageUrl.split('/') if u][-1]

            # the with block closes the file automatically
            with open(saveImgAs, "wb") as f:
                f.write(requests.get(imageUrl).content)  # download

            print(saveImgAs, image)

I'm not entirely sure how imageUrl should be formed, or how consistent your image src values are. With a few of your row values I could have run some tests first, but hopefully this works.
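
Whether the src should be joined to the site root or to the page URL depends on how the site writes its links. Here is a minimal sketch of the difference, using only urllib.parse. The page URL and src values are made up for illustration and do not come from the question:

```python
from urllib.parse import urljoin, urlparse

# Hypothetical values, for illustration only.
page_url = "https://example.com/shop/item/42.html"
relative_src = "backup/remote_2109image/photo.jpg"
rooted_src = "/backup/remote_2109image/photo.jpg"

parsed = urlparse(page_url)
root_url = f"{parsed.scheme}://{parsed.netloc}"  # scheme + host only

# A src without a leading slash resolves differently against root and page.
from_root = urljoin(root_url, relative_src)
from_page = urljoin(page_url, relative_src)

# A src with a leading slash always resolves against the site root.
rooted_from_page = urljoin(page_url, rooted_src)

print(from_root)
print(from_page)
print(rooted_from_page)
```

If the downloaded files turn out to be error pages rather than images, printing both candidates for one row and opening them in a browser should show which base the site expects.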

CodePudding user response:

I made some changes to download the images from the URLs listed in the CSV file:

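import csv
import os
import urllib.request

import requests
from bs4 import BeautifulSoup as bs

path = 'images'  # hypothetical output folder name; set this to your own folder
os.makedirs(path, exist_ok=True)

with open('cat.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        imagesname = ' '.join(row)
        r = requests.get(imagesname)

        soup = bs(r.content, 'html.parser')
        tables = soup.find_all('img', class_='pick')

        for image in tables:
            # turn any backslashes in the src into forward slashes
            img_url = image.get('src').replace('\\', '/')
            # "domain-name" is a placeholder for the site's base URL
            real_url = "domain-name" + img_url

            img_name = str(img_url.split('/')[-1])

            # download the image into the output folder
            urllib.request.urlretrieve(real_url, os.path.join(path, img_name))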
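
The "save into a folder" step can also be split out on its own. The sketch below is one possible way to do it, not the only one. The helper names image_filename and save_image and the fallback name "image.jpg" are hypothetical, chosen just for this example. It creates the folder if it is missing and drops any query string before taking the file name:

```python
import os
import posixpath
from urllib.parse import unquote, urlparse


def image_filename(image_url, default="image.jpg"):
    """Return the last path segment of image_url, ignoring any query string."""
    url_path = urlparse(image_url).path
    name = posixpath.basename(unquote(url_path))
    # Fall back to a default name when the URL path ends in a slash.
    return name or default


def save_image(data, folder, image_url):
    """Write raw image bytes into folder and return the path of the new file."""
    os.makedirs(folder, exist_ok=True)  # create the folder if it does not exist
    target = os.path.join(folder, image_filename(image_url))
    with open(target, "wb") as fh:
        fh.write(data)
    return target
```

Inside the loop this could be called as save_image(requests.get(real_url).content, path, real_url), assuming real_url is a complete URL.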