Selector in Puppeteer only returns one element

Time:01-24

I am creating an API using Puppeteer. The goal is to get data from football games to create a mobile app.

I wrote a script using Puppeteer. It works and gets the data I want. The problem is that I want the data for all games on the page, but it only returns the data for one game.

The site I am scraping is https://www.flashscore.com.br.

This is the service file:

import puppeteer from "puppeteer";

class NextGamesService {

    async execute() {

        const browser = await puppeteer.launch({ headless: false });
        const page = await browser.newPage();
        await page.goto('https://www.flashscore.com.br');

        const games = await page.$$('#live-table > section > div > div');

        const game1 = []

        for (const game of games) {


            const time = await page.evaluate((el) => el.querySelector('#g_1_IcYs8jIk > div.event__time')?.textContent, game)

            const home = await page.evaluate((el) => el.querySelector('div.event__participant.event__participant--home')?.textContent, game)

            const away = await page.evaluate((el) => el.querySelector('div.event__participant.event__participant--away')?.textContent, game)

            const league = await page.evaluate((el) => el.querySelector('div.icon--flag.event__title.fl_81 > div > span.event__title--name')?.textContent, game)

            game1.push({ time, home, away, league});
        }

        return ({game1})
    }
}

export { NextGamesService }

This is the controller:

import { NextGamesService } from "./nextGamesService";
import { Request, Response } from "express";

class NextGamesController {

    async handle(req: Request, res: Response) {

        const nextGamesService = new NextGamesService();

        const games = await nextGamesService.execute()

        return res.json(games)
    }
}

export {NextGamesController}

The JSON response I get:

{
    "game1": [
        {
            "time": "11:30",
            "home": "Dortmund",
            "away": "Augsburg",
            "league": "Bundesliga"
        }
    ]
}

Answer:

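Your selector, #live-table > section > div > div, matches the container that wraps all of the events, not the events themselves, so the loop runs only once. Each game lives in its own .event__match element, so select those instead. The time selector also hardcodes #g_1_IcYs8jIk, which looks like the id of a single game, so drop that as well.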

Some events don't have a time, for example when they're currently live. For those, you can fall back to the text of .event__stage if you want.
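The fallback for live games can be sketched as a small helper. This is only an illustration: toGame and fakeElement are hypothetical names, the stage text "2nd half" is made up, and the class names are the ones used in this answer, which the site may change at any time.

```javascript
// Hypothetical helper (not part of the original script): turns one
// `.event__match` element into a plain game record.
function toGame(el) {
  // Trimmed text of the first descendant matching `selector`, or undefined.
  const text = selector => el.querySelector(selector)?.textContent.trim();
  return {
    // Live games have no `.event__time`, so fall back to `.event__stage`.
    time: text(".event__time") || text(".event__stage"),
    home: text(".event__participant--home"),
    away: text(".event__participant--away"),
  };
}

// Stand-in for a DOM element so the helper can be tried without a browser.
const fakeElement = fields => ({
  querySelector: selector =>
    selector in fields ? { textContent: fields[selector] } : null,
});

// Made-up sample data for a live game (no `.event__time` present).
const liveGame = fakeElement({
  ".event__stage": " 2nd half ",
  ".event__participant--home": "Dortmund",
  ".event__participant--away": "Augsburg",
});
console.log(toGame(liveGame));
```

With Puppeteer, a self-contained function like this can be run per element with `await handle.evaluate(toGame)` on each handle returned by `page.$$(".event__match")`. It cannot be referenced by name inside a `$$eval` callback, because that callback is serialized and run in the page, where `toGame` is not defined.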

Since the page loads slowly, I'm blocking images, fonts, stylesheets and third-party requests to speed things up a bit.

Don't forget to handle errors and close the browser properly. Otherwise every failed run can leave a browser process behind. execute should probably be a static method, since the class holds no state, but I usually avoid classes and abstractions with Puppeteer until I have working code.

const puppeteer = require("puppeteer"); // ^19.1.0

const url = "<Your URL>";

let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setRequestInterception(true);
  const blocked = ["image", "font", "stylesheet"];
  page.on("request", req => {
    if (!req.url().includes("flashscore") ||
        blocked.includes(req.resourceType())) {
      req.abort();
    }
    else {
      req.continue();
    }
  });
  await page.goto(url, {waitUntil: "domcontentloaded"});
  await page.waitForSelector(".event__time");
  const events = await page.$$eval(".event__match", els =>
    els.map(e => {
      const text = x => e.querySelector(x)?.textContent.trim();
      return {
        time: text(".event__time") /*|| text(".event__stage")*/,
        home: text(".event__participant--home"),
        away: text(".event__participant--away"),
        league: text(".event__title--name")
      };
    })
  );
  console.log(events);
  console.log(events.length);
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());
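Once the script works, it could be folded back into the service from the question. The following is a minimal sketch, not a definitive implementation: the launch function is injected through the constructor (an assumption made here so the class can be exercised without a real browser), the url parameter is hypothetical, and the result key is renamed from game1 to games.

```javascript
// Sketch of the question's service with the browser always closed.
// `launch` is an injected function (assumption); in the real app it
// would be something like `() => puppeteer.launch()`.
class NextGamesService {
  constructor(launch) {
    this.launch = launch;
  }

  async execute(url) {
    const browser = await this.launch();
    try {
      const page = await browser.newPage();
      await page.goto(url, { waitUntil: "domcontentloaded" });
      await page.waitForSelector(".event__match");
      // One record per `.event__match` element, as in the script above.
      const games = await page.$$eval(".event__match", els =>
        els.map(e => {
          const text = s => e.querySelector(s)?.textContent.trim();
          return {
            time: text(".event__time"),
            home: text(".event__participant--home"),
            away: text(".event__participant--away"),
          };
        })
      );
      return { games };
    } finally {
      // Runs on success and on error, so the browser is never left open.
      await browser.close();
    }
  }
}
```

In the controller this would be used as `new NextGamesService(() => puppeteer.launch()).execute(url)`. Launching a browser per request is slow, so reusing one browser across requests may be worth considering later.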