Google Apps Script - how can I LOAD pdf?

Time:01-30

Just a simple question, as titled: I have a PDF file that contains only text, and I want to load it and extract the text.

CodePudding user response:

I believe your goal is as follows.

  • You want to extract the text from a text-only PDF file using Google Apps Script.

In this case, how about the following flow?

  1. Convert the PDF to a Google Document using Drive API, as a temporary file.
  2. Export text from the created Google Document.

When this flow is written as a script, it looks like this.

Sample script:

This sample uses Drive API (the `title` property follows Drive API v2), so please enable Drive API at Advanced Google services before you test the script.

function myFunction() {
  const fileId = "###"; // Please set the file ID of PDF file on Google Drive.

  // Convert PDF to Google Document.
  const docId = Drive.Files.copy({title: "temp", mimeType: MimeType.GOOGLE_DOCS}, fileId).id;
  
  // Retrieve text from Google Document.
  const text = DocumentApp.openById(docId).getBody().getText();

  // If you want to delete the temporary Google Document, uncomment the following line.
  // Drive.Files.remove(docId);

  console.log(text); // You can see the retrieved text in the log.

  // DriveApp.createFile("sample.txt", text); // If you want to save the text as a file, please use this line.
}
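
As a variation on the sample above, the same two steps can be wrapped in a reusable function that always deletes the temporary document, even when reading the text fails. This is only a sketch: the function name `extractTextFromPdf` and its optional `services` parameter are hypothetical and not part of the original answer, and it assumes the Drive API v2 advanced service is enabled.

```javascript
/**
 * Sketch: extract the text of a PDF on Google Drive by converting it to a
 * temporary Google Document. Assumes the Drive API v2 advanced service.
 * `services` is a hypothetical parameter that lets stand-in objects replace
 * the Apps Script globals (Drive, DocumentApp) outside Apps Script.
 */
function extractTextFromPdf(fileId, services) {
  const drive = services ? services.drive : Drive;
  const documentApp = services ? services.documentApp : DocumentApp;
  // Same value as MimeType.GOOGLE_DOCS in Apps Script.
  const docMimeType = "application/vnd.google-apps.document";

  // Step 1: copy the PDF as a Google Document (the copy performs the conversion).
  const docId = drive.Files.copy({ title: "temp", mimeType: docMimeType }, fileId).id;
  try {
    // Step 2: read the plain text from the converted document.
    return documentApp.openById(docId).getBody().getText();
  } finally {
    // Delete the temporary document whether or not step 2 succeeded.
    drive.Files.remove(docId);
  }
}
```

In Apps Script this could be called as `console.log(extractTextFromPdf("###"));` with the PDF's file ID in place of `###`. The `try`/`finally` is a design choice to avoid leaving stray "temp" documents in Drive when an error occurs.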
