Home > front end >  How to scrape icon link of image using mechanize gem
How to scrape icon link of image using mechanize gem

Time:11-23

I have a url where I have to scrape all images using mechanize gem, but some image url's are in rel=icon.

I have to get the image from this url:

<link rel="icon" href="https://mywebsite.com/wp-content/uploads/2021/10/cropped-favicon-32x32.png" sizes="32x32">

This is my code I tried which scrapes only images. How to get both working as one.

require 'mechanize'
url = "https://mywebsite.com/"

agent = Mechanize.new
page = agent.get(url)

page.images.each do |image|
  puts image #getting here all images here from image tag
end

CodePudding user response:

I looked over Mechanize Page Link but it returns only the anchors.

Tried it with xpath

page.xpath('//link[contains(@rel, "icon")]').each do |icon|
  p icon.attr('href')
end

And received:

"https://ownwebsite.com/wp-content/uploads/2021/10/cropped-favicon-32x32.png"
"https://ownwebsite.com/wp-content/uploads/2021/10/cropped-favicon-192x192.png" 
"https://ownwebsite.com/wp-content/uploads/2021/10/cropped-favicon-180x180.png"

Here is a Replit that returns all the images.

CodePudding user response:

page.search('link').each do |link|
      if link['href'].to_s.include?(".gif") or link['href'].to_s.include?(".png")  or link['href'].to_s.include?(".jpg")  or link['href'].to_s.include?(".jpeg")
      puts link['href']
      end
    end
  • Related