Home > Blockchain >  Google Apps Script limitation to Extract data from web page using Cheerio Library
Google Apps Script limitation to Extract data from web page using Cheerio Library

Time:06-28

Continuing my previous enter image description here


This is the code snippet to get around it:

function CheckPageSpeed(url) {
 
  const apiKey = "###"; // Please set your API key.


  const apiEndpoint = `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?key=${apiKey}&url=${encodeURIComponent(url)}&category=performance`;
  const strategy = ["mobile"];
  
  
  const res = UrlFetchApp.fetchAll(strategy.map(e => ({ url: `${apiEndpoint}&strategy=${e}`, muteHttpExceptions: true })));
  
  

  
  const values = res.reduce((o, r, i) => {
    if (r.getResponseCode() == 200) {
      const obj = JSON.parse(r.getContentText());
        o[strategy[i]] = obj.lighthouseResult.categories.performance.score * 100;
  
    } else {
      o[strategy[i]] = null;
    }
    return o;
  }, {});
 
  return values.mobile;
  
}

As I am using it in Google sheets as custom formula, sometimes it takes so much time that the sheet throws the following error: enter image description here


Is there any way that we can counter this error so that it starts calculating the score again instead of throwing an error? Thank you.

CodePudding user response:

Issue and workaround:

From your showing image, your error of Exceed maximum execution time and your updated script, in this case, it is considered that the execution time of the script is over 30 seconds. (In the current stage, the maximum execution time of the custom function is 30 seconds. Ref) In this case, when the error of Exceed maximum execution time occurs, unfortunately, this cannot be used as the trigger. And also, in the current stage, UrlFetchApp cannot be stopped over time. And, for example, even when all URLs are retrieved and each value is retrieved from the API, I'm not sure whether the processing time is over 6 minutes. I'm worried about this.

From the above situation, how about manually rerunning only the custom functions which occur the error?

Sample script:

Before you use this script, please enable Sheets API at Advanced Google services. How about executing this function by a button on Spreadsheet and/or the custom menu?

function reCalculation() {
  const sheetName = "Sheet1"; // Please set sheet name.
  const formula = "=CheckPageSpeed"; // Please set the function name of your custom function.
  const dummy = "=sample";
  
  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const ssId = ss.getId();
  const sheet = ss.getSheetByName(sheetName);
  const sheetId = sheet.getSheetId();
  const values = sheet.getRange("B1:B"   sheet.getLastRow()).getDisplayValues();
  const requests = values.reduce((ar, [a], i) => {
    if (a == "#ERROR!") {
      ar.push({ findReplace: { range: { sheetId, startRowIndex: i, endRowIndex: i   1, startColumnIndex: 1, endColumnIndex: 2 }, find: `^${formula}`, replacement: dummy, includeFormulas: true, searchByRegex: true } }); // Modified
    }
    return ar;
  }, []);
  if (requests.length == 0) return;
  Sheets.Spreadsheets.batchUpdate({ requests }, ssId);
  SpreadsheetApp.flush();
  requests.forEach(r => {
    r.findReplace.find = dummy;
    r.findReplace.replacement = formula;
    r.findReplace.searchByRegex = false;
  });
  Sheets.Spreadsheets.batchUpdate({ requests }, ssId);
}
  • When this script is run, only the cells of #ERROR! in the column "B" are recalculated.

Note:

  • I thought that in this case, this function might be able to be executed by the time-driven trigger. But, in that case, it might affect the quotas (maximum execution time is 90 minutes/day) of the time-driven trigger. So, in this answer, I proposed to run this function using manual operation.

References:

Added:

For example, in your situation, how about directly requesting the API endpoint using fetchAll method? The sample script is as follows. In this case, the URLs are retrieved from the column "A" and the values are retrieved and put to the column "C" in your sample Spreadsheet.

Sample script:

Please set your API key. And, please run this script with the script editor. By this, the values are retrieved using the API.

function reCalculation2() {
  const apiKey = "###"; // Please set your API key.
  const sheetName = "Sheet1"; // Please set sheet name.

  const ss = SpreadsheetApp.getActiveSpreadsheet();
  const sheet = ss.getSheetByName(sheetName);
  const values = sheet.getRange("A2:A"   sheet.getLastRow()).getValues();
  const requests = values.map(([url]) => {
    const apiEndpoint = `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?key=${apiKey}&url=${encodeURIComponent(url)}&category=performance&strategy=mobile`;
    return { url: apiEndpoint, muteHttpExceptions: true };
  });
  const res = UrlFetchApp.fetchAll(requests);
  const v = res.map(r => {
    if (r.getResponseCode() == 200) {
      const obj = JSON.parse(r.getContentText());
      return [obj.lighthouseResult.categories.performance.score * 100];
    }
    return [null];
  });
  sheet.getRange(2, 3, v.length).setValues(v);
}
  • In this case, fetchAll method is used. By this, I thought that the error of Exceeded maximum execution might be able to be avoided.
  • Related