scrapy get image size without downloading

Time:11-02

I want to get an image's size without downloading it. Is that possible?

image1 URL: https://koctas-img.mncdn.com/mnresize/600/600/productimages/1000599303/1000599303_1_MC/8843182866482_1663925809606.jpg

def parse_product(self, response):

    images = response.css(".swiper-slide::attr(data-large)").getall()
    image1 = images[0]
    image_size=yield Request(image1, method="HEAD", callback=self.callback)

Answer:

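You can send a HEAD request instead of a GET. The server then returns only the response headers, and the Content-Length header gives the file size in bytes without the image body being downloaded. Note that this is the file size, not the pixel dimensions, and that a server may omit the header.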

import scrapy


class ExampleSpider(scrapy.Spider):
    name = 'example_spider'

    def start_requests(self):
        images_urls = [
            'http://wallpapercave.com/wp/wp1809904.jpg',
            'https://i2.wp.com/www.otakutale.com/wp-content/uploads/2015/10/One-Punch-Man-Anime-Magazine-Visual-01.jpg',
            'https://thedeadtoons.com/wp-content/uploads/2020/06/One-Punch-Man-Season-3.jpg'
        ]
        for url in images_urls:
            yield scrapy.Request(url=url, method='HEAD')

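    def parse(self, response, **kwargs):
        # A HEAD response has no body; the file size in bytes is in the
        # Content-Length header, which Scrapy returns as a bytes value.
        yield {
            'Content-Length': response.headers['Content-Length']
        }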

Output:

[scrapy.core.scraper] DEBUG: Scraped from <200 https://thedeadtoons.com/wp-content/uploads/2020/06/One-Punch-Man-Season-3.jpg>
{'Content-Length': b'179681'}
[scrapy.core.scraper] DEBUG: Scraped from <200 https://i2.wp.com/www.otakutale.com/wp-content/uploads/2015/10/One-Punch-Man-Anime-Magazine-Visual-01.jpg>
{'Content-Length': b'1847153'}
[scrapy.core.scraper] DEBUG: Scraped from <200 https://wallpapercave.com/wp/wp1809904.jpg>
{'Content-Length': b'246144'}
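
The values above are raw bytes, so you will probably want to convert them to integers, and also handle servers that do not send the header at all. Here is a minimal sketch of that; the helper name content_length_to_int and the 'size_bytes' key are my own choices for illustration, not part of Scrapy:

```python
from typing import Optional


def content_length_to_int(raw: Optional[bytes]) -> Optional[int]:
    """Convert a raw Content-Length header value to an int.

    Returns None when the header is missing, not a number, or negative.
    (Hypothetical helper, not part of Scrapy.)
    """
    if raw is None:
        return None
    try:
        size = int(raw.decode("ascii").strip())
    except (UnicodeDecodeError, ValueError):
        return None
    return size if size >= 0 else None


# Possible use inside the spider's parse callback:
#     raw = response.headers.get('Content-Length')  # None if the header is absent
#     yield {'url': response.url, 'size_bytes': content_length_to_int(raw)}

print(content_length_to_int(b'179681'))
```

Using response.headers.get(...) rather than response.headers[...] avoids a KeyError when a server leaves out Content-Length, in which case the size simply comes back as None.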