I am trying to scrape the images from a personal Imgur gallery, https://imgur.com/a/FIR1BL1, so I can then format them and prepare them for linking on my website. I want a list of all the image links, but for some reason I can't get any. I also tried a CSS selector, with no luck. I suspect it might be because the images are too deeply nested. I also don't have much experience with scraping.
This is what I came up with using Python and BeautifulSoup:
import requests
from bs4 import BeautifulSoup
# Make a GET request to the website
r = requests.get("https://imgur.com/a/FIR1BL1")
# Parse the HTML content
soup = BeautifulSoup(r.content, 'html.parser')
# Find the first "div" element with class "PostContent-imageWrapper-rounded"
div = soup.find("div", class_="PostContent-imageWrapper-rounded")
if div:
    # Find all the "img" elements inside the div
    img_tags = div.find_all('img')
    # Print the src attribute of each img element
    for img in img_tags:
        print(img['src'])
else:
    print("Div not found")
CodePudding user response:
You can try to use their API:
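import requests
# FIR1BL1 is the album ID from the gallery URL
url = "https://api.imgur.com/post/v1/albums/FIR1BL1?client_id=546c25a59c58ad7&include=media"
# Fetch the album metadata and decode the JSON body
data = requests.get(url).json()
# Each entry in "media" describes one image
for m in data['media']:
    print(m['url'])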
Prints:
https://i.imgur.com/q4UuhEq.jpeg
https://i.imgur.com/WFVRr9Q.jpeg
https://i.imgur.com/QSl0OpM.jpeg
https://i.imgur.com/0yKgw0Y.jpeg
https://i.imgur.com/BV2JfUw.jpeg
https://i.imgur.com/hITF8Y9.jpeg
https://i.imgur.com/HxQDu52.jpeg
https://i.imgur.com/S13WUFn.jpeg
https://i.imgur.com/MDEN7G6.jpeg
https://i.imgur.com/HNuWMOw.jpeg
CodePudding user response:
You are not finding them because they are not in the HTML you download. The images are loaded from the Imgur API. To see the request that loads them:
- Open a new tab
- Open the developer tools and go to the Network tab
- Open your Imgur link in that tab (https://imgur.com/a/FIR1BL1 is the one you have)
- Use the search box to find this request, or something similar:
https://api.imgur.com/post/v1/albums/FIR1BL1
This request returns the data you are looking for. Try to reconstruct a similar request in your script and call .json() on the response to parse it.
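A minimal sketch of what reconstructing that request could look like. The helper names extract_media_urls and fetch_album_urls are made up for illustration. The query parameters (client_id, include=media) and the response layout (a "media" list whose entries carry a "url" key) are assumptions taken from the other answer in this thread, not from official documentation, so they may change.

```python
API_TEMPLATE = "https://api.imgur.com/post/v1/albums/{album_id}"


def extract_media_urls(payload):
    """Return the 'url' of every entry under 'media'.

    Assumes the JSON layout shown in the other answer; entries
    without a 'url' key are skipped instead of raising KeyError.
    """
    return [m["url"] for m in payload.get("media", []) if "url" in m]


def fetch_album_urls(album_id, client_id, timeout=10):
    """Request the album JSON from the API and return its image URLs."""
    # Imported here so the parsing helper above works without the dependency.
    import requests

    response = requests.get(
        API_TEMPLATE.format(album_id=album_id),
        # Assumed query parameters, copied from the URL in the other answer.
        params={"client_id": client_id, "include": "media"},
        timeout=timeout,
    )
    # Fail loudly on 4xx/5xx instead of trying to parse an error page.
    response.raise_for_status()
    return extract_media_urls(response.json())
```

You would call fetch_album_urls("FIR1BL1", client_id) with whatever client_id you see in the Network tab. Keeping the parsing in its own function means you can test it against a saved copy of the JSON without hitting the API every time.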