How to export scraped items as a list of dictionaries in Scrapy


I wrote a Scrapy project with 4 crawlers, each scraping a different e-commerce website. For each crawler, I want to take the 5 products with the lowest prices from its website and export all of them into a single CSV file.

Right now, my main code looks like this:

from scrapy.crawler import CrawlerProcess

process = CrawlerProcess()
process.crawl(Crawler1)
process.crawl(Crawler2)
process.crawl(Crawler3)
process.crawl(Crawler4)
process.start()

I want each crawler to return a list of dictionaries so that I can iterate through it with a for loop and compare the prices.

Do I need to use a Scrapy pipeline to do this? How can I make Scrapy return a list of scraped items (each of which is a dictionary) instead of just exporting them to a file?

CodePudding user response:

Here's an example using a spider from another post. I pass the spider name to the function, but you can tweak it to your needs:

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
from scrapy.signalmanager import dispatcher
from scrapy import signals


def spider_output(spider):
    output = []

    # Called for every item the spider yields; collects them into the list
    def get_output(item):
        output.append(item)

    # Hook the collector up to Scrapy's item_scraped signal
    dispatcher.connect(get_output, signal=signals.item_scraped)

    settings = get_project_settings()
    settings['USER_AGENT'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
    process = CrawlerProcess(settings)
    process.crawl(spider)
    process.start()  # blocks until the crawl finishes
    return output


if __name__ == "__main__":
    spider = 'vdsc'
    print(spider_output(spider))
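
To apply this to the original question, keep in mind that process.start() starts the Twisted reactor, which can only be started once per Python process, so calling spider_output() once per spider won't work. Instead you can register all four crawlers on a single CrawlerProcess, tag each collected item with the spider it came from, then sort and write the CSV yourself. The following is only a sketch: it assumes the Crawler1 ... Crawler4 classes from the question, that each scraped item is dict-like, and that it has hypothetical "name" and "price" fields with numeric prices.

from collections import defaultdict
import csv

from scrapy.crawler import CrawlerProcess
from scrapy.signalmanager import dispatcher
from scrapy import signals


def collect_items(crawler_classes):
    # Returns {spider_name: [item_dict, ...]} for all spiders in one run
    items_by_spider = defaultdict(list)

    def on_item_scraped(item, response, spider):
        items_by_spider[spider.name].append(dict(item))

    dispatcher.connect(on_item_scraped, signal=signals.item_scraped)

    process = CrawlerProcess()
    for cls in crawler_classes:
        process.crawl(cls)
    process.start()  # single reactor start for all four crawls
    return items_by_spider


if __name__ == "__main__":
    results = collect_items([Crawler1, Crawler2, Crawler3, Crawler4])

    with open("cheapest.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["spider", "name", "price"])
        for spider_name, items in results.items():
            # assumed fields: sort each site's items by price, keep the 5 cheapest
            cheapest = sorted(items, key=lambda i: float(i["price"]))[:5]
            for item in cheapest:
                writer.writerow([spider_name, item.get("name"), item["price"]])

Because the price comparison happens after process.start() returns, no pipeline is strictly required, though an item pipeline that appends to a shared list would work just as well.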