Home > database >  Puling 90,000 emails to Google Sheet
Puling 90,000 emails to Google Sheet

Time:06-10

I have used https://webapps.stackexchange.com/questions/160232/gmail-bulk-emails-manipulation and modified a bit to copy some 90,000 emails (from a free gmail account) to a sheet.

The script is working only once. After one cycle, the trigger becomes "disabled" with "Unknown reason".

I reduced the batch size, increased the time gap, but still it is getting the same error.

Where am I going wrong?


/**
 * Creates the first trigger to call batchArchiveEmail. 
 */
function init(){

    var Triggers = ScriptApp.getProjectTriggers();
    for (var p = 0; p < Triggers.length; p  ) {
      ScriptApp.deleteTrigger(Triggers[p])
    }



   ScriptApp
    .newTrigger('batchArchiveEmail')
    .timeBased()
    .after(60 * 1000)
    .create();
    console.log(`trigger created`)
}

/**
 * Archive emails by batches preventing controlling limiting the execution time and  
 * creating a trigger if there are still threads pending to be archived.
 */ 
function batchArchiveEmail(){
  const start = Date.now();
  /** 
   * Own execution time limit for the search and archiving operations to prevent an 
   * uncatchable error. As the execution time check is done in do..while condition there  
   * should be enough time to one search and archive operation and to create a trigger 
   * to start a new execution. 
   */ 
  const maxTime = 3 * 60 * 1000; // Instead of 25 use 3 for Google free accounts
  const batchSize = 50;
  let threads, elapsedTime;
  var ss=SpreadsheetApp.getActive();
  var sheet=ss.getSheetByName("Sheet1");
  /** Search and archive threads, then repeat until the search returns 0 threads or the 
   * maxTime is reached
   */ 
  var ms=[];
  do {

    threads = GmailApp.search('label:inbox before:2022/5/1');
    for (let j = 0; j < threads.length; j  = batchSize) {
      //below code added by me
      ms=[];
      var messages = threads[j].getMessages();
      for (var m = 0; m < messages.length; m  ) {
        var from = messages[m].getFrom(); //from field
        var mId = messages[m].getId();//id field to create the link later
        var supportStats = [];
        var to = messages[m].getTo();//to field
        var time = messages[m].getDate();//date field
        var subject = messages[m].getSubject();//subject field
        var body=messages[m].getPlainBody();
        var tel=[];
        tel = body.match(/[\ ]?\d{10}|\(\d{3}\)\s?-\d{6}|\d{3}\s-\d{3}\s\d{4}/);
        supportStats.push(from);
        supportStats.push(to);
        supportStats.push(time);
        supportStats.push(subject);
        supportStats.push('https://mail.google.com/mail/u/0/#inbox/' mId); //build the URL to the email
        supportStats.push(body);
        if (tel){supportStats.push(tel[0])} else {supportStats.push("")};
        ms.push(supportStats);
      }
    var lr=sheet.getLastRow();
    sheet.getRange(lr 1,1,ms.length,7).setValues(ms);
    //above code added by me
    GmailApp.moveThreadsToArchive(threads.slice(j, j   batchSize));
    };
    /**
     * Used to prevent to have too many calls in a short time, might not be 
     * necessary with a large enough batchSize
     */
    Utilities.sleep(`2000`); 
    elapsedTime = Date.now() - start;
  } while (threads.length > 0 &&  elapsedTime < maxTime);
  if(threads.length > 0){
    /** Delete the last trigger */

    var Triggers = ScriptApp.getProjectTriggers();
    for (var p = 0; p < Triggers.length; p  ) {
      ScriptApp.deleteTrigger(Triggers[p])
    }


    //deleteTriggers();

    /** Create a one-time new trigger */
    ScriptApp
    .newTrigger('batchArchiveEmail')
    .timeBased()
    .after(300 * 1000)
    .create();
    console.log(`next trigger created`)
  } else {
    /** Delete the last trigger */
    var Triggers = ScriptApp.getProjectTriggers();
    for (var p = 0; p < Triggers.length; p  ) {
      ScriptApp.deleteTrigger(Triggers[p])
    }
    console.log(`No more threads to process`);
  }
}


CodePudding user response:

The issue of This trigger has been disable for an unknown reason is reported here, and seems to be related to the V8 runtime. Click on the 1 to let Google know that you are also affected by the issue.

Deactivating the Project Settings > General Settings > Enable Chrome V8 runtime seems to fix the issue for some users. Same as changing the appsscript.json key runtimeVersion from:

  "runtimeVersion": "V8"

to

  "runtimeVersion": "DEPRECATED_ES5"

CodePudding user response:

Issue and workaround:

When I tested your script, I confirmed the same situation with you. In this case, unfortunately, even when the next trigger time is increased, the issue couldn't be avoided.

In this answer, I would like to introduce a workaround that I have already posted at this thread. Unfortunately, my answer was not useful for that thread. But, fortunately, when I tested your script, I confirmed that this workaround can be used.

About The newTrigger is never running., if you are using V8 runtime at the script editor and the installed trigger doesn't work, I'm worried that this might be the bug. Ref1 Ref2

But, fortunately, in the current stage, there is a workaround for removing this bug. It's to use Web Apps. Ref I had had the same situation before. The flow of this workaround is as follows.

  1. Run the script you want to run.
  2. When the time-driven trigger is installed, that is installed using the Web Apps.
    • This is the important point.

In this case, the time-driven trigger installed by the script is run. By this workaround, your script works with enabling V8 runtime. When your script is modified, it becomes as follows.

Usage:

In this workaround, Web Apps is used. So, please do the following flow.

1. Deploy Web Apps.

  1. On the script editor, at the top right of the script editor, please click "click Deploy" -> "New deployment".
  2. Please click "Select type" -> "Web App".
  3. Please input the information about the Web App in the fields under "Deployment configuration".
  4. Please select "Me" for "Execute as".
  5. Please select "Only myself" for "Who has access".
  6. Please click "Deploy" button. And, click "Done" button.
  7. On the script editor, at the top right of the script editor, please click "click Deploy" -> "Test deployments".
  8. Please copy the URL of Web Apps.
    • In this case, the URL is like https://script.google.com/macros/s/###/dev.

The detailed information can be seen at the official document and my report.

2. Modified script.

Please modify your script as follows. Please set your Web Apps URL to webAppsUrl.

const functionName = "batchArchiveEmail"; // This is the function name for executing.
const webAppsUrl = "https://script.google.com/macros/s/###/dev"; // Please set your Web Apps URL.

// When v8 runtime is used, when the trigger is set from the function executing by a trigger, the trigger is disabled. This is the recognized bug. But unfortunately, this has still not been resolved. (September 21, 2021)
// https://issuetracker.google.com/issues/150756612
// https://issuetracker.google.com/issues/142405165
const doGet = _ => {
  deleteTriggers();
  ScriptApp.newTrigger(functionName).timeBased().after(60000).create();
  return ContentService.createTextOutput();

  // DriveApp.getFiles(); // This is used for automatically detecting the scopes for requesting to Web Apps. Please don't remove this comment line.
};

const deleteTriggers = _ => {
  ScriptApp.getProjectTriggers().forEach(e => {
    if (e.getHandlerFunction() == functionName) {
      ScriptApp.deleteTrigger(e);
    }
  });
}

// Please run this function.
function init() {
  deleteTriggers();
  UrlFetchApp.fetch(webAppsUrl, { headers: { authorization: "Bearer "   ScriptApp.getOAuthToken() } });
  console.log(`trigger created`)
}

/**
 * Archive emails by batches preventing controlling limiting the execution time and  
 * creating a trigger if there are still threads pending to be archived.
 */
function batchArchiveEmail() {
  const start = Date.now();
  /** 
   * Own execution time limit for the search and archiving operations to prevent an 
   * uncatchable error. As the execution time check is done in do..while condition there  
   * should be enough time to one search and archive operation and to create a trigger 
   * to start a new execution. 
   */ 
  const maxTime = 3 * 60 * 1000; // Instead of 25 use 3 for Google free accounts
  const batchSize = 50;
  let threads, elapsedTime;
  var ss=SpreadsheetApp.getActive();
  var sheet=ss.getSheetByName("Sheet1");
  /** Search and archive threads, then repeat until the search returns 0 threads or the 
   * maxTime is reached
   */ 
  var ms=[];
  do {

    threads = GmailApp.search('label:inbox before:2022/5/1');
    for (let j = 0; j < threads.length; j  = batchSize) {
      //below code added by me
      ms=[];
      var messages = threads[j].getMessages();
      for (var m = 0; m < messages.length; m  ) {
        var from = messages[m].getFrom(); //from field
        var mId = messages[m].getId();//id field to create the link later
        var supportStats = [];
        var to = messages[m].getTo();//to field
        var time = messages[m].getDate();//date field
        var subject = messages[m].getSubject();//subject field
        var body=messages[m].getPlainBody();
        var tel=[];
        tel = body.match(/[\ ]?\d{10}|\(\d{3}\)\s?-\d{6}|\d{3}\s-\d{3}\s\d{4}/);
        supportStats.push(from);
        supportStats.push(to);
        supportStats.push(time);
        supportStats.push(subject);
        supportStats.push('https://mail.google.com/mail/u/0/#inbox/' mId); //build the URL to the email
        supportStats.push(body);
        if (tel){supportStats.push(tel[0])} else {supportStats.push("")};
        ms.push(supportStats);
      }
    var lr=sheet.getLastRow();
    sheet.getRange(lr 1,1,ms.length,7).setValues(ms);
    //above code added by me
    GmailApp.moveThreadsToArchive(threads.slice(j, j   batchSize));
    };
    /**
     * Used to prevent to have too many calls in a short time, might not be 
     * necessary with a large enough batchSize
     */
    Utilities.sleep(2000); 
    elapsedTime = Date.now() - start;
  } while (threads.length > 0 &&  elapsedTime < maxTime);

  if (threads.length > 0) {
    /** Delete the last trigger */
    deleteTriggers();

    /** Create a one-time new trigger */
    UrlFetchApp.fetch(webAppsUrl, { headers: { authorization: "Bearer "   ScriptApp.getOAuthToken() } });
    console.log(`next trigger created`)
  } else {
    /** Delete the last trigger */
    deleteTriggers();
    console.log(`No more threads to process`);
  }
}

3. Testing.

In the modified script, as the 1st run, please run init() with the script editor. By this, your script is run and install the time-driven trigger with Web Apps. And, the installed trigger is run automatically by the time-driven trigger.

Note:

  • In this modified script, it supposes that your function batchArchiveEmail() works fine. Please be careful about this.
  • If you disable the V8 runtime, I thought that the script might be worked without the above modification. But in that case, the process cost of the loop will become high. Ref By this, I would like to introduce this workaround.

Reference:

  • Related