Puppeteer, awaiting a selector, and returning data from within

Time:05-07

I am loading a page and intercepting its requests. When a certain element shows up, I want to stop loading and extract the data I need.

Here is the problem I am facing.

Simplified, the code looks something like this:

async function loadPage()
    {
        var contentLoaded = false;
        var content;

        // When the element shows up, do something.
        // In the real code this is page.waitForSelector, but for simplicity
        // I use a timeout here, because the problem is the same.
        // The element "shows up" after 10 seconds. When it does, content
        // is set to 100 (standing in for the extracted data) and stored.

        setTimeout(()=>{
            content = 100;
            contentLoaded = true;
        },10000)

        // Here I have a function that loads the page,
        // intercepts requests and handles them
        // until the content is loaded.

        page.on('request', req =>{
             if(!contentLoaded)
             {
                 // keep loading page
             }
          })

        // This is the piece of code I would like NOT to run
        // UNTIL I either get the data or a timeout error
        // from page.waitForSelector...

        // But JavaScript will run it if it's not busy with the
        // loading function above...

        // The content shows up after 10 seconds and is stored
        // in content, but this piece of code has already finished
        // by then, so the function returns false...

        if(contentLoaded)
            {return content}
        else
            {return false}

    }

var x = loadPage();
x.then(console.log); // should log the data, or false if an error occurred

Thank you all for taking the time to read this and help out. I'm a novice, so any feedback or reading material is welcome if you think there is something I'm not fully understanding.

CodePudding user response:

Solved

Simple explanation:

Here is what I was trying to accomplish:

  1. Intercept page requests so that I can decide what not to load, and speed up loading.
  2. Once an element shows up on the page, extract some data and return it.
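
A minimal sketch of the first step, assuming Puppeteer's request interception calls (`page.setRequestInterception`, `request.resourceType`, `request.abort`, `request.continue`). The helper name `blockHeavyResources` and the list of blocked resource types are illustrative assumptions, not part of the original code.

```javascript
// Hypothetical helper: skip resource types we don't need so the page loads faster.
// `page` is expected to be a Puppeteer Page; the blocked types are an assumption.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

async function blockHeavyResources(page) {
  // Interception must be enabled before requests can be aborted or continued.
  await page.setRequestInterception(true);

  page.on('request', (req) => {
    if (BLOCKED_TYPES.has(req.resourceType())) {
      req.abort();      // do not load this resource
    } else {
      req.continue();   // let everything else through
    }
  });
}
```

Every intercepted request has to be either continued or aborted, otherwise it stays pending.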

I was trying to return it like this (note: all the browser setup and error handling is left out of these snippets, since it would just clutter the explanation):

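var data = loadPage(url);

async function loadPage(URL)
    {
     var data;

     // selector stands for whatever element we are waiting for
     page.waitForSelector(selector).then(() => {
         // page.evaluate returns data to x...
         // data = x;
     });
     return data; // runs before the callback above, so data is still undefined
    }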

This doesn't work because return runs immediately, while the waitForSelector callback runs later, so the function always resolves to undefined...
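
The timing problem can be reproduced without a browser. The sketch below is an illustration only: `fakeWaitForSelector` is a made-up stand-in that resolves after a short delay, roughly the way `page.waitForSelector` resolves once the element appears.

```javascript
// Made-up stand-in for page.waitForSelector: resolves with 100 after `ms` milliseconds.
function fakeWaitForSelector(ms) {
  return new Promise((resolve) => setTimeout(() => resolve(100), ms));
}

// Broken: the promise is not awaited, so `return` runs before the callback.
async function loadPageBroken() {
  let data;
  fakeWaitForSelector(50).then((x) => { data = x; });
  return data; // still undefined at this point
}

// Fixed: `await` pauses the function until the promise settles.
async function loadPageFixed() {
  const data = await fakeWaitForSelector(50);
  return data;
}

loadPageBroken().then((v) => console.log('broken:', v)); // logs "broken: undefined"
loadPageFixed().then((v) => console.log('fixed:', v));   // logs "fixed: 100"
```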

The correct way of doing it, or rather the way that works for me, is to return the whole promise and then extract the data from it...

var data = loadPage(url);
data.then(/* do what needs to be done with the data */);

async function loadPage(URL)
    {
    // selector stands for whatever element we are waiting for
    var data = page.waitForSelector(selector).then(() => {
         // page.evaluate returns data to x...
         // return x;
     });
    return data; // we return data as a promise
    }
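
An alternative sketch using `await` with `try`/`catch`, so that the function resolves to the data, or to `false` if the selector never shows up (the behaviour asked for in the question). It assumes `page` is a Puppeteer `Page` passed in as a parameter. The name `extractWhenReady`, its parameters, and the 10-second default timeout are illustrative, and reading `textContent` is just one possible choice of data.

```javascript
// Sketch: resolve to the element's text, or false if the selector never appears.
// `page` is assumed to be a Puppeteer Page; names and defaults are illustrative.
async function extractWhenReady(page, selector, timeout = 10000) {
  try {
    // Rejects (and so throws here) if the element does not appear within `timeout` ms.
    await page.waitForSelector(selector, { timeout });
  } catch (err) {
    // Any failure while waiting, including a timeout, is reported as false.
    return false;
  }
  // $eval runs the callback inside the page against the first matching element.
  return page.$eval(selector, (el) => el.textContent);
}

// Usage, inside an async function with an open page:
//   const data = await extractWhenReady(page, '.classname');
```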

I hope that's a solid enough explanation. If someone needs to see the whole thing, I can edit the question and put the full code there...

CodePudding user response:

This has worked for me: it waits for the element and returns the requested value as data. I do not know every property it can read, but I have listed the ones I know work below the code.

// Must run inside an async function; page is a Puppeteer Page.
const content = await page.waitForSelector(`.classname`)
  .then(() => page.evaluate(() => {
      // This callback runs in the browser, so it uses document, not page.
      return document.querySelector(`.classname`).value;
  }))
  .catch(() => false); // timeout or any other error

Some Examples

  1. .text
  2. .innerHTML
  3. .innerText
  4. .value
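
A small sketch of reading one of these properties by name, assuming Puppeteer's `page.$eval`, which accepts extra arguments after the callback and passes them into the page. The helper name `readProperty` is made up for illustration, and which properties are meaningful depends on the element type.

```javascript
// Hypothetical helper: read one DOM property from the first element matching `selector`.
// `page` is assumed to be a Puppeteer Page.
async function readProperty(page, selector, property) {
  // Arguments after the callback are serialized and handed to it inside the page.
  return page.$eval(selector, (el, prop) => el[prop], property);
}

// Usage, inside an async function with an open page:
//   const value = await readProperty(page, '.classname', 'value');
//   const text  = await readProperty(page, '.classname', 'innerText');
```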

Lmk if this helped :)
