puppeteer express api data not updating

Time:08-29

I'm trying to scrape a stock market website.

Here is my code:

const express = require("express");
const app = express();
const port = 3000;

const puppeteer = require("puppeteer");
const fs = require("fs/promises");
let items = [];
let item = [];
let data;
let high;
let low;
let lastPrice;
let highs = [];
let change = [];
let value = [];
const fun = async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setUserAgent(
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
  );

  await page.goto(
    "https://www.moneycontrol.com/stocks/marketstats/bse-mostactive-stocks/bse-100-1/"
  );
  data = await page.evaluate(() => {
    return Array.from(document.getElementsByClassName("gld13 disin")).map(
      (x) => x.innerText
    );
  });
  high = await page.evaluate(() => {
    return Array.from(
      document.querySelectorAll(
        ".bsr_table.hist_tbl_hm table tbody tr td[align='right'][width='175']"
      )
    ).map((x) => x.innerHTML);
  });
  low = await page.evaluate(() => {
    return Array.from(
      document.querySelectorAll(
        ".bsr_table.hist_tbl_hm table tbody tr td[align='right'][width='180']"
      )
    ).map((x) => x.innerHTML);
  });
  lastPrice = await page.evaluate(() => {
    return Array.from(
      document.querySelectorAll(
        ".bsr_table.hist_tbl_hm table tbody tr td[align='right'][width='185']"
      )
    ).map((x) => x.innerHTML);
  });

  
  
  // The width='175' cells are read in groups of three: high, change, value.
  for (let i = 0; i < high.length; i = i + 3) {
    highs.push(high[i]);
    change.push(high[i + 1]);
    value.push(high[i + 2]);
  }

  items.push(data,highs,low,lastPrice,change,value)


  for (let i = 0; i < data.length; i++) {
    item.push({
      name : items[0][i],
      high : items[1][i],
      low : items[2][i],
      lastPrice : items[3][i],
      change : items[4][i],
      value : items[5][i]
    })
  }
  
  // Wait for the write to finish before closing the browser.
  await fs.writeFile(
    "data.json",
    JSON.stringify(item)
  );
  await browser.close();
};

fun()



app.get('/', (req, res) => {
  res.send('Hello World!')
})
app.get('/data', (req, res) => {
  res.status(200).json(item)
})
app.listen(port, () => {
  console.log(`Example app listening on port ${port}`)
})

Everything works fine when I run node index (index.js is my file name), and localhost:3000 also works, but the data never updates.

To work around that I'm using nodemon, which re-runs the file every second and so rewrites data.json. data.json is fine and updates every second, but the API on localhost does not work: it returns [].

NOTE: Stock market data changes every second.

I want to serve the data on localhost, and I want to update only the values that have changed rather than rebuilding the whole API response.

Please help <3

CodePudding user response:

It's because you run fun() only once, so the data is never refreshed. You need to call it from a setInterval:

setInterval(()=>fun(),10000)

This runs the fun function every 10 seconds. The interval is given in milliseconds, so you can change it as needed.
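
One thing to watch with a bare setInterval: in the question's code, fun pushes into module-level arrays, so repeated calls would keep appending rows, and a slow scrape could overlap with the next tick. The following is a minimal sketch of one way around both, not the asker's code. startPolling and its scrape argument are hypothetical names, and it assumes the scraping function is changed to return a fresh array of rows on each call.

```javascript
// Hypothetical helper: polls `scrape` and keeps only the most recent result.
// `scrape` is assumed to be an async function that returns a new array each time.
function startPolling(scrape, intervalMs) {
  let latest = [];     // most recent successful result
  let running = false; // true while a scrape is still in flight

  const tick = async () => {
    if (running) return; // previous run has not finished; skip this tick
    running = true;
    try {
      latest = await scrape(); // replace the old rows instead of appending
    } catch (err) {
      console.error("scrape failed:", err.message);
    } finally {
      running = false;
    }
  };

  tick(); // run once right away instead of waiting for the first interval
  const timer = setInterval(tick, intervalMs);

  return {
    getLatest: () => latest,
    stop: () => clearInterval(timer),
  };
}
```

With something like this, the /data route could respond with getLatest() instead of the module-level item array, so each request sees the last completed scrape.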

CodePudding user response:

From what I understand, you're using nodemon to re-run the file every second. At the top of the file you declare item = [], and then you define a function that runs puppeteer.

While puppeteer is still running, the app starts and begins listening. fun() is asynchronous and nothing waits for it, so the server will almost always be up before the scrape finishes. The scrape most likely takes more than a second, which means nodemon has already restarted the app by then and item has been re-declared as []. That is what causes the issue.

As a fix, consider moving the whole data-collecting process into a separate file and running fun() from a setInterval. Also consider using a database that the route reads from on every request. Having nodemon constantly restart the app is not good practice.
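
As a rough illustration of the "read on every request" idea, here is a sketch that uses the data.json file from the question as a stand-in for a database. readLatest is a hypothetical helper, and the route wiring in the comment is an assumption about how it would be plugged into the asker's Express file.

```javascript
const fs = require("fs/promises");

// Hypothetical helper: returns the rows last written by the scraper,
// or an empty array if the file does not exist yet.
async function readLatest(path) {
  try {
    const text = await fs.readFile(path, "utf8");
    return JSON.parse(text);
  } catch (err) {
    if (err.code === "ENOENT") return []; // scraper has not written anything yet
    throw err; // any other error (bad JSON, permissions) is left to the caller
  }
}

// Assumed wiring in the Express file from the question:
// app.get("/data", async (req, res) => {
//   try {
//     res.status(200).json(await readLatest("data.json"));
//   } catch (err) {
//     res.status(500).json({ error: err.message });
//   }
// });
```

A read that happens while the scraper is in the middle of writing the file could see incomplete JSON, which is one reason a database is the sturdier choice here.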
