Loop increment not working over element [Puppeteer]

Time:11-17

I am working with Puppeteer and trying to get information about each item from an Amazon search page.

I want to get the first 10 result items from this search, but I can't get it to work.

I have already tested all of the class selectors in the Chrome console, and they return what I want.

(async() => {
   const browser = await puppeteer.launch({headless: false});
   const page = await browser.newPage();
   await page.setViewport({ width: 1366, height: 768});
   await page.goto("https://www.amazon.com/s?k=rtx 3070&ref=nb_sb_ss_pltr-ranker-24hours_2_2");

const itemList = [];

//class elements
const item = await page.$(".sg-col-20-of-24.s-result-item.s-asin.sg-col-0-of-12.sg-col-16-of-20.sg-col.s-widget-spacing-small.sg-col-12-of-16");
const item_title = await page.$(".a-size-medium.a-color-base.a-text-normal");
const item_img = await page.$(".s-image");
const item_price = await page.$(".a-price-whole");
const item_review = await page.$(".a-size-base.s-underline-text");

for (let i = 0; i < Math.min(9, item.length); i++) {
    const title = await (await item_title[i].getProperty('innerText')).jsonValue();
    const img = await (await item_img[i].getAttribute('src'));
    const price = await (await item_price[i].getProperty('innerText')).jsonValue();
    const review = await (await item_review[i].getProperty('innerText')).jsonValue();

    //scroll action
    if (i % 4 === 5) {
      await page.evaluate( () => {
        window.scrollBy(0, window.innerHeight);
        new Promise(function(resolve) { 
          setTimeout(resolve, 3000)
        });
      });

      itemList.push(title, img, price, review)
      console.log(itemList);
    }
    console.log(textOnTheDiv);
    //console.log(imgAuteur);
  }; 
})()

Do you think my problem comes from the increment, guys? Thanks

CodePudding user response:

Since it seems like you are trying to loop through all elements that match your class selector, you should probably replace:

const item = await page.$(".sg-col-20-of-24.s-result-item.s-asin.sg-col-0-of-12.sg-col-16-of-20.sg-col.s-widget-spacing-small.sg-col-12-of-16");

with:

const item = await page.$$(".sg-col-20-of-24.s-result-item.s-asin.sg-col-0-of-12.sg-col-16-of-20.sg-col.s-widget-spacing-small.sg-col-12-of-16");

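as page.$ only selects the first matching element, while page.$$ selects all matching elements. The same applies to the other page.$ calls (item_title, item_img, item_price and item_review), since the loop indexes into each of them.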
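This also explains why the loop body in the question never runs, regardless of the increment. page.$ resolves to a single handle (or null), which has no length property, so Math.min(9, item.length) evaluates to NaN and the condition i < NaN is false on the very first check. A minimal sketch, with plain objects standing in for Puppeteer's return values (singleHandle, handleArray and countIterations are hypothetical names for illustration, not real ElementHandles or Puppeteer APIs):

```javascript
// Plain objects standing in for Puppeteer results (assumption for illustration):
// page.$ resolves to one handle, page.$$ resolves to an array of handles.
const singleHandle = {};          // like the result of page.$(selector)
const handleArray = [{}, {}, {}]; // like the result of page.$$(selector)

// Count how many times the question's loop body would run for a given result.
function countIterations(result) {
  let iterations = 0;
  for (let i = 0; i < Math.min(9, result.length); i++) {
    iterations++;
  }
  return iterations;
}

const withSingle = countIterations(singleHandle); // length is undefined, so the limit is NaN
const withArray = countIterations(handleArray);   // the limit is Math.min(9, 3)

console.log(withSingle, withArray); // prints "0 3"
```

Note also that Math.min(9, ...) caps the loop at nine items; for the first ten results the cap would need to be 10.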


CodePudding user response:

page.$ retrieves one element. Use page.$$ to select multiple. That said, I prefer not to assume all of these "parallel arrays" have the same length. Instead, I'd select the root elements, then query for all of the child elements within each parent container.
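To illustrate the concern with parallel arrays, here is a small sketch that uses plain data rather than a live page. The results array and its fields are made up for the example; the point is only that a product without a price shifts every later price by one position when titles and prices are collected separately.

```javascript
// Hypothetical search results; the second product has no price element.
const results = [
  { title: "Card A", price: "499" },
  { title: "Card B" },
  { title: "Card C", price: "529" },
];

// Parallel arrays: comparable to running page.$$ once per selector over the whole page.
const titles = results.map(r => r.title);
const prices = results.filter(r => r.price !== undefined).map(r => r.price);

// Pairing by index silently attaches Card C's price to Card B.
const byIndex = titles.map((title, i) => ({ title, price: prices[i] }));

// Per-container lookup: each title stays with the price found in its own parent.
const byContainer = results.map(r => ({ title: r.title, price: r.price }));

console.log(byIndex[1]);     // Card B paired with the wrong price
console.log(byContainer[1]); // Card B with no price, which is correct
```

Querying inside each parent container avoids this because a missing child only affects that one item.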

The first 10 elements are already on the page in the static HTML, so there's no need to scroll.

Perhaps try something like:

const puppeteer = require("puppeteer"); // ^19.0.0

const url = "<your URL>";

let browser;
(async () => {
  browser = await puppeteer.launch({headless: false});
  const [page] = await browser.pages();
  await page.goto(url, {waitUntil: "domcontentloaded"});
  const data = await page.$$eval(
    ".s-result-item.s-asin.sg-col-0-of-12.sg-col-16-of-20",
    els => els.map(el => {
      const text = s => el.querySelector(s)?.textContent;
      return {
        title: text(".a-size-medium.a-color-base.a-text-normal"),
        img: el.querySelector(".s-image")?.src,
        price: text(".a-price-whole"),
        review: text(".a-size-base.s-underline-text"),
      };
    })
  );
  console.log(data);
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());

You might need to tweak your review selector--I'm not sure what the expectation was for this.
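Since the question asks only for the first 10 results, the array can be trimmed inside the browser callback. A minimal sketch, assuming the same selectors as above still match the page; firstResults and its limit parameter are names introduced here for illustration, and page is expected to be a Puppeteer Page. It relies on page.$$eval forwarding any extra arguments to the callback.

```javascript
// Hypothetical helper: returns data for at most `limit` result containers.
// `page` is assumed to be a Puppeteer Page; extra arguments given to $$eval
// are passed on to the callback that runs in the browser.
async function firstResults(page, limit = 10) {
  return page.$$eval(
    ".s-result-item.s-asin.sg-col-0-of-12.sg-col-16-of-20",
    (els, max) => els.slice(0, max).map(el => {
      // Missing child elements yield undefined instead of throwing.
      const text = s => el.querySelector(s)?.textContent.trim();
      return {
        title: text(".a-size-medium.a-color-base.a-text-normal"),
        img: el.querySelector(".s-image")?.src,
        price: text(".a-price-whole"),
      };
    }),
    limit
  );
}
```

It could be called as const data = await firstResults(page); in place of the page.$$eval call in the script above.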
