I searched the documentation but wasn't able to find any related article.
I want to know whether I can have several crawlers defined in a single Apify project, just like you can have several Spiders in Scrapy, or whether I have to create a new project for each new website I'd like to crawl.
I would appreciate any response, thank you in advance!
CodePudding user response:
Yes, you can create as many crawler instances as you need.
It's usually good practice to separate concerns: handle sitemap crawling with its own CheerioCrawler or BasicCrawler instance, with specific settings and its own request queue, then do the full scrape with whichever crawler you need, such as PuppeteerCrawler, also with its own queue if needed. A sketch of that split follows.
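For instance, something along these lines (a minimal sketch against the Apify SDK v2 API, so the queue names, selectors, and sitemap URL are purely illustrative; newer Crawlee releases rename handlePageFunction to requestHandler):

```js
const Apify = require('apify');

Apify.main(async () => {
    // Separate named queues keep the two crawls independent.
    const sitemapQueue = await Apify.openRequestQueue('sitemap');
    const detailQueue = await Apify.openRequestQueue('detail');

    await sitemapQueue.addRequest({ url: 'https://example.com/sitemap.xml' });

    // Lightweight crawler: parse the sitemap and feed the detail queue.
    const sitemapCrawler = new Apify.CheerioCrawler({
        requestQueue: sitemapQueue,
        // Sitemaps are XML, which CheerioCrawler rejects by default.
        additionalMimeTypes: ['application/xml', 'text/xml'],
        handlePageFunction: async ({ $ }) => {
            const urls = $('loc').map((_, el) => $(el).text()).get();
            for (const url of urls) {
                await detailQueue.addRequest({ url });
            }
        },
    });

    // Full browser crawler for the pages themselves, reading its own queue.
    const detailCrawler = new Apify.PuppeteerCrawler({
        requestQueue: detailQueue,
        handlePageFunction: async ({ page, request }) => {
            await Apify.pushData({ url: request.url, title: await page.title() });
        },
    });

    await sitemapCrawler.run();
    await detailCrawler.run();
});
```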
You can choose to run them in parallel:

```js
await Promise.all([
    crawler1.run(),
    crawler2.run(),
]);
```

or one at a time:

```js
await crawler1.run();
await crawler2.run();
```
The caveat when using `Promise.all` is that if both crawlers read from and write to the same key-value store, you might run into race conditions. If they don't share any state, you should be good to go.
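If they do need to persist state, one simple way to stay safe is to give each crawler its own named key-value store so no keys are shared (the store names below are made up):

```js
// Each crawler writes to its own named store, so there are no shared keys
// and therefore no races between the two runs.
const store1 = await Apify.openKeyValueStore('crawler1-state');
const store2 = await Apify.openKeyValueStore('crawler2-state');

await store1.setValue('progress', { pagesDone: 0 });
await store2.setValue('progress', { pagesDone: 0 });
```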