image_url = images[0]['src']

I was writing this code:

import requests
from bs4 import BeautifulSoup as bs
url = "https://keithgalli.github.io/web-scraping/webpage.html"
r = requests.get(url + "webpage.html")

webpage = bs(r.content)

images = webpage.select("div.row div.column img")
image_url = images[0]['src']
full_url = url + image_url

img_data = requests.get(full_url).content
with open('late_combo.jpg', 'wb') as handler:
    handler.write(img_data)

but got this error: IndexError: list index out of range

CodePudding user response：

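I think the error is in the URL. You already initialized url = "https://keithgalli.github.io/web-scraping/webpage.html", and then you append "webpage.html" to it again in the requests.get call. The resulting address, "https://keithgalli.github.io/web-scraping/webpage.htmlwebpage.html", doesn't exist, so the response you parse contains no elements matching "div.row div.column img". That leaves images as an empty list, and images[0] raises the IndexError.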

Do this instead:

r = requests.get(url)

After that change it should work.
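
To make this kind of failure easier to spot, one option is to check whether the selector matched anything before indexing into the result. The following is only a minimal sketch: first_image_src is a hypothetical helper name, and the two inline HTML strings are made-up stand-ins for the real page and for an error page, so that the example runs without a network request.

```python
from bs4 import BeautifulSoup


def first_image_src(html, selector="div.row div.column img"):
    """Return the src of the first matching <img>, or None if nothing matches.

    Hypothetical helper for illustration; not part of requests or bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    images = soup.select(selector)
    if not images:
        # An empty list here is what made images[0] raise IndexError.
        return None
    return images[0].get("src")


# Made-up stand-ins for the real page and for an error page.
good_page = (
    '<div class="row"><div class="column">'
    '<img src="images/italy/lake_como.jpg"></div></div>'
)
error_page = "<html><body><h1>404</h1></body></html>"

print(first_image_src(good_page))   # images/italy/lake_como.jpg
print(first_image_src(error_page))  # None
```

On the real request, calling r.raise_for_status() right after requests.get(url) should also turn an error status such as 404 into an exception before any parsing happens.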

CodePudding user response：

import requests
from bs4 import BeautifulSoup as bs
url = "https://keithgalli.github.io/web-scraping/webpage.html"
r = requests.get(url)

webpage = bs(r.content,'html.parser')

images = webpage.select("div.row div.column img")
image_url = images[0]['src']
full_url = 'https://keithgalli.github.io/web-scraping' + image_url

print(full_url)

Output:

https://keithgalli.github.io/web-scrapingimages/italy/lake_como.jpg
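
Note that the printed URL has no slash between "web-scraping" and "images", so requesting it would most likely fail. Instead of concatenating strings by hand, one option is urljoin from the standard library, which resolves a relative path against the page URL. This is a sketch, assuming images[0]['src'] is the relative path "images/italy/lake_como.jpg" (inferred from the output above); page_url is just an illustrative variable name.

```python
from urllib.parse import urljoin

# URL of the page that was scraped.
page_url = "https://keithgalli.github.io/web-scraping/webpage.html"

# Assumed value of images[0]['src'], inferred from the printed output.
image_url = "images/italy/lake_como.jpg"

# urljoin drops the last path segment of the base ("webpage.html")
# and appends the relative path in its place.
full_url = urljoin(page_url, image_url)
print(full_url)
# https://keithgalli.github.io/web-scraping/images/italy/lake_como.jpg
```

The same call works with the original url variable from the question, so there is no need to hard-code a second base string.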