Trying to scrape image URLs in Python using Beautiful Soup

Time:08-29

I'm new to Python and need some help. I am trying to scrape the image URLs from this site but can't seem to do so. Instead of the URL, I get back the whole HTML of the element. Here is my code.

import requests
import pandas as pd
import urllib.parse
from bs4 import BeautifulSoup
import csv

baseurl = ('https://www.thewhiskyexchange.com/')

productlinks = []

for x in range(1,4):
    r = requests.get(f'https://www.thewhiskyexchange.com/c/316/campbeltown-single-malt-scotch-whisky?pg={x}')
    soup = BeautifulSoup(r.content, 'html.parser')
    tag = soup.find_all('ul',{'class':'product-grid__list'})
    
    for items in tag:
        for link in items.find_all('a', href=True):
            productlinks.append(baseurl + link['href'])
#print(len(productlinks))

for items in productlinks:
    r = requests.get(items)
    soup = BeautifulSoup(r.content, 'html.parser')

    name = soup.find('h1', class_='product-main__name').text.strip()
    price = soup.find('p', class_='product-action__price').text.strip()
    imgurl = soup.find('div', class_='product-main__image-container')
    print(imgurl)

And here is the piece of HTML I am trying to scrape from.

<div ><img src="https://img.thewhiskyexchange.com/480/gstob.non1.jpg" alt="Glen Scotia Double Cask Sherry Finish"  loading="lazy" width="3" height="4">

I would appreciate any help. Thanks.

CodePudding user response:

You need to first select the image then get the src attribute. Try this:

imgurl = soup.find('div', class_='product-main__image-container').find('img')['src']
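
The one-liner above raises an error if either the container or the image is missing on a page. A minimal sketch of a more defensive version follows. It assumes the container `div` carries the `product-main__image-container` class used in the question. The sample markup and the helper name `extract_image_url` are hypothetical, made up for illustration, and not taken from the live site.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modeled on the snippet in the question.
SAMPLE_HTML = """
<div class="product-main__image-container">
  <img src="https://img.thewhiskyexchange.com/480/gstob.non1.jpg"
       alt="Glen Scotia Double Cask Sherry Finish"
       loading="lazy" width="3" height="4">
</div>
"""


def extract_image_url(html):
    """Return the product image URL, or None if it cannot be found."""
    soup = BeautifulSoup(html, "html.parser")

    # Step 1: locate the container div (find() returns None when absent).
    container = soup.find("div", class_="product-main__image-container")
    if container is None:
        return None

    # Step 2: locate the first <img> inside the container.
    img = container.find("img")
    if img is None:
        return None

    # Step 3: read the src attribute; .get() returns None instead of
    # raising KeyError when the attribute is missing.
    return img.get("src")


print(extract_image_url(SAMPLE_HTML))
# https://img.thewhiskyexchange.com/480/gstob.non1.jpg
```

In the scraping loop from the question, the same checks could be applied to each product page before printing or storing the URL.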

CodePudding user response:

I'm not sure if I fully understand what output you are looking for. But if you just want the img source URLs, this might work:

    # imgurl = soup.find('div', class_='product-main__image-container')
    imgurl = soup.find('img', class_='product-main__image')
    imgurl_attribute = imgurl['src']
    print(imgurl_attribute)

#https://img.thewhiskyexchange.com/900/gstob.non1.jpg
#https://img.thewhiskyexchange.com/900/gstob.15yov1.jpg
#https://img.thewhiskyexchange.com/900/gstob.18yov1.jpg
#https://img.thewhiskyexchange.com/900/gstob.25yo.jpg
#https://img.thewhiskyexchange.com/900/sets_gst1.jpg
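
As a rough sketch of the same idea, the image can also be matched with a CSS selector and the URLs collected into a list instead of printed. This assumes the `img` tag carries the `product-main__image` class used above. The function name `collect_image_urls` and the sample pages are hypothetical and only stand in for the HTML fetched with `requests`.

```python
from bs4 import BeautifulSoup


def collect_image_urls(html_pages):
    """Collect the src of the main product image from each HTML page."""
    urls = []
    for html in html_pages:
        soup = BeautifulSoup(html, "html.parser")
        # select_one() takes a CSS selector and returns the first match or None.
        img = soup.select_one("img.product-main__image")
        # Skip pages where the image or its src attribute is missing.
        if img is not None and img.has_attr("src"):
            urls.append(img["src"])
    return urls


# Hypothetical stand-ins for downloaded product pages.
pages = [
    '<img class="product-main__image" '
    'src="https://img.thewhiskyexchange.com/900/gstob.non1.jpg">',
    "<p>page without a product image</p>",
]

print(collect_image_urls(pages))
# ['https://img.thewhiskyexchange.com/900/gstob.non1.jpg']
```

Returning a list makes it straightforward to write the URLs out later, for example alongside the name and price already extracted in the question's loop.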