How to run window.stop after certain time in puppeteer

Time:05-18

I want to scrape a site, but there is a problem: the content loads in about 1 second, while the loader in the navbar keeps spinning for 30 seconds to 1 minute, so my Puppeteer scraper keeps waiting for that loader to finish. Is there any way to run window.stop() after a certain timeout?


    const checkBook = async () => {
        await page.goto(`https://wattpad.com/story/${bookid}`, {
            waitUntil: 'domcontentloaded',
        });
        const is404 = await page.$('#story-404-wrapper');
        if (is404) {
            socket.emit('error', {
                message: 'Story not found',
            });
            await browser.close();
            return {
                error: true,
            };
        }
        storyLastUpdated = await page
            .$eval(
                '.table-of-contents__last-updated strong',
                (ele: any) => ele.textContent,
            )
            .then((date: string) => getDate(date));
    };

CodePudding user response:

You could drop the

 waitUntil: 'domcontentloaded',

option and pass a timeout to page.goto() instead, as documented here: https://github.com/puppeteer/puppeteer/blob/v14.1.0/docs/api.md#pagegotourl-options

Alternatively, set the timeout to 0 (which disables it) and then use one of the page.waitFor... methods, like this:

await page.waitForTimeout(30000);
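
As a rough illustration of the second suggestion, the sketch below disables the navigation timeout and then waits for one specific element instead of sleeping for a fixed time. The `NavigablePage` interface and the `gotoAndWaitFor` helper are hypothetical names introduced here; the interface is only a stand-in for the two Puppeteer `Page` methods used, so the snippet can be read without Puppeteer installed. Keeping `waitUntil: 'domcontentloaded'` from the question is also an assumption. As far as I know, newer Puppeteer releases have dropped `page.waitForTimeout()`, which is another reason to prefer `page.waitForSelector()` where possible.

```typescript
// Hypothetical stand-in for the two Puppeteer Page methods this sketch uses.
interface NavigablePage {
  goto(url: string, options?: { timeout?: number; waitUntil?: string }): Promise<unknown>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
}

// Navigate with the navigation timeout disabled (timeout: 0), then wait
// up to maxWaitMs for the element the scraper actually needs.
async function gotoAndWaitFor(
  page: NavigablePage,
  url: string,
  selector: string,
  maxWaitMs: number = 30000,
): Promise<void> {
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 0 });
  // Rejects if the selector does not appear within maxWaitMs.
  await page.waitForSelector(selector, { timeout: maxWaitMs });
}
```

With the question's code, this might be called as gotoAndWaitFor(page, url, '.table-of-contents__last-updated strong'), so the scraper continues as soon as that element exists rather than after a fixed delay.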

CodePudding user response:

Similar approach to Marcel's answer. The following will do the job:

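// Not awaited on purpose; the catch keeps a late navigation error from becoming an unhandled rejection
page.goto(url).catch(() => {})
await page.waitForTimeout(1000)
await page.evaluate(() => window.stop())
// your scraper script goes here
await browser.close()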

Notes:

  • page.goto() is NOT awaited, so you save time compared to waiting for the DOMContentLoaded or load events.
  • Because goto is not awaited, you need to make sure the DOM is ready before your script starts working with it. You can use either page.waitForTimeout() or page.waitForSelector() for that.
  • window.stop() is a browser API, so execute it inside page.evaluate(). This also avoids errors such as: Error: Navigation failed because browser has disconnected!
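
The notes above can be combined into one small helper. This is a minimal sketch, not the answerer's code: `StoppablePage` and `loadThenStop` are hypothetical names, and the interface only mirrors the three Puppeteer `Page` methods used. It starts navigation without awaiting it, waits until a selector appears or a deadline passes, and then stops the remaining page loading. Passing a string to `page.evaluate()` should be equivalent to `page.evaluate(() => window.stop())`; the string form is used here only so the sketch does not depend on DOM typings.

```typescript
// Hypothetical stand-in for the three Puppeteer Page methods this sketch uses.
interface StoppablePage {
  goto(url: string): Promise<unknown>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
  evaluate(pageFunction: string): Promise<unknown>;
}

// Returns true if readySelector appeared before the deadline, false otherwise.
// In both cases the page is told to stop loading afterwards.
async function loadThenStop(
  page: StoppablePage,
  url: string,
  readySelector: string,
  maxWaitMs: number = 5000,
): Promise<boolean> {
  // Start navigation but do not await it. The catch handler keeps a late
  // rejection from surfacing as an unhandled promise rejection.
  page.goto(url).catch(() => undefined);

  let ready = true;
  try {
    await page.waitForSelector(readySelector, { timeout: maxWaitMs });
  } catch {
    // The selector did not show up in time; stop loading anyway.
    ready = false;
  }

  // Runs in the page context, where window.stop() is available.
  await page.evaluate('window.stop()');
  return ready;
}
```

A caller would check the returned flag before scraping, since a false result means the expected element never appeared and the DOM may be incomplete.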