Issue with .$$eval() in Puppeteer

Time:10-17

I am attempting to scrape data from https://monkeytype.com/ like this:

const wordsSelector = await page.waitForSelector('#words');
console.log(await wordsSelector.$eval('div', words => words.innerHTML));

This works as expected, returning the following (the string spells "interest"):

<letter>i</letter><letter>n</letter><letter>t</letter><letter>e</letter><letter>r</letter><letter>e</letter><letter>s</letter><letter>t</letter>

However, there are multiple divs with similar innerHTML inside #words that I would like to scrape. I was under the impression that simply replacing .$eval with .$$eval was the correct way to do this, but when I make the change like this:

const wordsSelector = await page.waitForSelector('#words');
console.log(await wordsSelector.$$eval('div', words => words.innerHTML));

words becomes an empty array and words.innerHTML is obviously undefined.

Am I using .$$eval incorrectly?

Answer:

When using .$$eval(), the array of elements passed to the callback is typically mapped over:

const puppeteer = require("puppeteer"); // ^19.0.0

let browser;
(async () => {
  browser = await puppeteer.launch({headless: true});
  const [page] = await browser.pages();
  const url = "https://monkeytype.com/";
  await page.goto(url, {waitUntil: "domcontentloaded"});
  const $ = (...args) => page.waitForSelector(...args);
  await (await $(".rejectAll")).click();
  await $("#words .word.active");
  const wordsEl = await $("#words");
  const words = await wordsEl.$$eval(".word", els =>
    els.map(el => el.innerHTML)
  );
  console.log(words);
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close())
;
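
To see why the callback in the question returns undefined, here is a minimal sketch that runs without a browser. It assumes plain objects as stand-ins for the DOM elements, and the names fakeEls, fromQuestion and fromAnswer are made up for illustration. The point is only that .$$eval hands the callback an array, and an array has no innerHTML property of its own:

```javascript
// Plain objects stand in for the elements Puppeteer would pass to the callback.
// (Assumption: only the innerHTML property matters for this illustration.)
const fakeEls = [
  {innerHTML: "<letter>t</letter><letter>h</letter><letter>e</letter>"},
  {innerHTML: "<letter>c</letter><letter>a</letter><letter>t</letter>"},
];

// Callback shaped like the one in the question: reads a property off the array.
const fromQuestion = words => words.innerHTML;

// Callback shaped like the one in the answer: reads the property off each element.
const fromAnswer = els => els.map(el => el.innerHTML);

console.log(fromQuestion(fakeEls)); // undefined
console.log(fromAnswer(fakeEls).length); // 2
```

The same reasoning applies to any per-element property, such as textContent.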

I'm not sure the HTML is that useful here, though. I'd suggest .textContent. If your goal is to type the words into the input, you might try something like:

const puppeteer = require("puppeteer"); // ^19.0.0

let browser;
(async () => {
  browser = await puppeteer.launch({headless: true});
  const [page] = await browser.pages();
  const url = "https://monkeytype.com/";
  await page.setRequestInterception(true);
  const allowed = [
    "https://monkeytype.com",
    "https://www.monkeytype.com",
    "https://api.monkeytype.com",
    "https://fonts.google",
  ];
  page.on("request", request => {
    if (allowed.some(e => request.url().startsWith(e))) {
      request.continue();
    }
    else {
      request.abort();
    }
  });
  await page.goto(url, {waitUntil: "domcontentloaded"});
  const $ = (...args) => page.waitForSelector(...args);
  await (await $(".rejectAll")).click();
  await $("#words .word.active");
  const words = await page.$("#words");

  try {
    for (;;) {
      const word = await words.$eval(".word.active", el =>
        el.textContent.trim()
      );
      await words.type(word + " ");
    }
  }
  catch (err) {} // an error from $eval or type is treated as the signal that the test is over

  const results = await $("#result");
  await results.evaluate(el => el.scrollIntoView());
  await results.screenshot({path: "typing-results.png"});
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close())
;
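
The request filter in the script above boils down to a prefix check against an allow-list. Here is a minimal standalone sketch of that check, using the same prefixes; isAllowed is a hypothetical helper name and the sample URLs are made up for illustration:

```javascript
// Same prefixes as in the script above.
const allowed = [
  "https://monkeytype.com",
  "https://www.monkeytype.com",
  "https://api.monkeytype.com",
  "https://fonts.google",
];

// Hypothetical helper: true when the URL starts with any allowed prefix.
const isAllowed = url => allowed.some(prefix => url.startsWith(prefix));

console.log(isAllowed("https://monkeytype.com/")); // true
console.log(isAllowed("https://fonts.googleapis.com/css2")); // true
console.log(isAllowed("https://ads.example.com/track.js")); // false
```

Inside the "request" handler this would read isAllowed(request.url()) ? request.continue() : request.abort(). Note that a plain prefix match is loose, since "https://fonts.google" also matches any longer host name that begins with it.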

Optimization is left as an exercise here. See also this answer for another typing test, which is a harder page to automate for various subtle reasons.
