Home > Software design >  Populate Google Sheet with Google Docs in Drive that contain certain words
Populate Google Sheet with Google Docs in Drive that contain certain words

Time:11-11

Here's a summary of what I'm trying to achieve:

  1. Run Apps Script that searches all Google Docs in a Drive.
  2. If the script finds a document that contains at least one of three words (always in the header or footer), then it outputs this into a row within a Google Sheet, capturing the Google Doc ID, the Doc's name, a link to it, and the word that it found in the search (optionally, it'd be nice to include the owner name). Note that the header/footer will only ever contain one of a few words -- never a combination of them.

E.g., find all files that have "alice", "bob", or "carol" in the header or footer; log these entries as separate rows in a Google Sheet.

Google Doc ID | Alice Project Summary 2021 | Link to Doc | "alice"

I feel like I've found what I need within the Apps Script API, and found some similar search results, but can't quite piece this together.

Totally happy to expound more if that helps or if this is unclear!

Cheers for any pointers! (Even if it's just pointing out that my searching wasn't comprehensive enough.)

CodePudding user response:

I believe your goal is as follows.

  • You have a Google Spreadsheet. When you open the Google Spreadsheet, you want to run a script for searching all Google Document files in your Google Drive by checking the header and footer of the Document and putting the searched result to the specification sheet in the Google Spreadsheet.
  • You want to search Google Document that the values of "alice", "bob", or "carol" are included in the header and footer. And, you want to put the values on the Spreadsheet.
  • There are hundreds/thousands of Google Docs in your Google Drive.
  • You want to achieve this using Google Apps Script.

In this case, I would like to propose the following flow.

  1. Search the Google Document files that the values of "alice", "bob", or "carol" are included in the header and footer using the search query of Drive API.
    • By this method, I thought that the search cost might be able to be reduced a little.
  2. Check the header and footer of each Google Document. When the values of "alice", "bob", or "carol" are found, the Document ID, title, link, and searched value are put to an array.
  3. Put the array on Spreadsheet.

Sample script:

Please copy and paste the following script to the script editor of Google Spreadsheet and set the sheet name. And, in this sample script, Drive API is used. So please enable Drive API at Advanced Google services.

And also, please install OnOpen trigger to the function of myFunction as the installable trigger. By this, when you open Spreadsheet, the script is automatically run. In the case of the installable trigger, the maximum execution time is 6 minutes and Drive API can be used. On the other hand, the simple trigger is 30 seconds. Please be careful about this.

function myFunction() {
  const searchTexts = ["alice", "bob", "carol"];

  // 1. Search the Google Document files that the values of "alice", "bob", or "carol" are included in the header ahd footer using the search query of Drive API.
  const q = `mimeType = '${MimeType.GOOGLE_DOCS}' and (`   searchTexts.map(s => `fullText contains '${s}'`).join(" or ")   ")";
  let ar = [];
  let pageToken = "";
  do {
    const res = Drive.Files.list({q, pageToken, maxResults: 1000, fields: "items(id, title, alternateLink)"});
    if (res.items.length > 0) {
      ar = ar.concat(res.items);
    }
    pageToken = res.nextPageToken;
  } while (pageToken);

  // 2. Check the header and footer of each Google Document. When the values of "alice", "bob", or "carol" are found, the Document ID, title, link and searched value are put to an array.
  const values = ar.reduce((ar, {id, title, alternateLink}) => {
    const doc = DocumentApp.openById(id);
    let temp = [];
    const header = doc.getHeader();
    if (header) {
      const hText = header.getText();
      temp = temp.concat(searchTexts.filter(e => hText.includes(e)));
    }
    const footer = doc.getFooter();
    if (footer) {
      const fText = footer.getText();
      temp = temp.concat(searchTexts.filter(e => fText.includes(e)));
    }
    if (temp.length > 0) ar.push([id, title, alternateLink, temp.join(",")]);
    return ar;
  }, []);

  // 3. Put the array to Spreadsheet.
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1"); // Please set the sheetname.
  sheet.getRange(sheet.getLastRow()   1, 1, values.length, values[0].length).setValues(values);
}

Note:

  • In my environment, I do have not hundreds/thousands of Google Docs. And, I cannot test your situation. So please test the above script in your actual situation.

References:

  • Related