Home > OS >  How to import specific text/line under headings in Google doc to Google sheet using google apps scri
How to import specific text/line under headings in Google doc to Google sheet using google apps scri

Time:09-27

I am trying to get specific text/line that is under the heading or subheading using the google apps script. Here is the screenshot of the Google doc:

Google sheets

I am looking to export the Question text/line from every heading to google sheets like this:

Google sheet

So far I have been able to get all the headings from the google doc by this code snippet:

function getHeadings() {
  var pars = DocumentApp.getActiveDocument().getBody().getParagraphs();
  var hdg = [];
  for(var i=0; i<pars.length; i  ) {
    var par = pars[i];
    hdg.push(par.getHeading());

  }
     Logger.log(hdg)
}

But I am unable to get the Question text under all these headings, kindly can you guide me in the right direction? (I apologize for my question formatting, I am a newbie on this website). Thank you

CodePudding user response:

Use getChild() and and push() functions

Setup

I used the following as setup:

test

Script

function getHeadings() {
  var doc = DocumentApp.getActiveDocument();
  var childNum = doc.getBody().getNumChildren();
  var ss = SpreadsheetApp.openById("<spreadsheetID>").getSheetByName("Sheet1"); //Change ID of Spreadsheet and Sheet name
  var out = [];
  for (var i = 0; i < childNum; i  = 2) {
    out.push([doc.getBody().getChild(i).asText().getText()]);
  }
  for (var i = 1, j = 0; i < childNum; i  = 2, j  ) {
    out[j].push(doc.getBody().getChild(i).asText().getText().replace(/Question(.*)- /gm, ""));
  }
  ss.getRange(ss.getLastRow()   1, 1, childNum / 2, 2).setValues(out); //edit range to proper coordinates
}

The script uses the getChild() function to get the texts in the document. Afterwards, the getNumChildren() will get the number of children to be used in the two separate for loops which will segregate the heading and the question texts. After the texts are identified, they are separated, paired, and then pushed (using the push() function) into arrays. Once the arrays are formed, the data will be appended to the spreadsheet using the setValues() function.

Output

output

Update

Strictly for this format: setup 2

You may use the following script as basis for your code:

function getHeadings() {
  var doc = DocumentApp.getActiveDocument();
  var childNum = doc.getBody().getNumChildren();
  var ss = SpreadsheetApp.openById("<SpreadsheetID>").getSheetByName("Sheet1"); //Change ID of Spreadsheet and Sheet name
  var header = [];
  var question = [];
  var out = [];
  for (var i = 0; i < childNum; i  ) {
    if (doc.getBody().getChild(i).asText().getText().match(/Header(.*)/gm)){
      header.push(doc.getBody().getChild(i).asText().getText());
    }
    else if (doc.getBody().getChild(i).asText().getText().match(/Question(.*)/gm)){
      question.push(doc.getBody().getChild(i).asText().getText());
    }
  }
  for (j=0; j<header.length; j  ) {
    out.push([header[j]]);
    out[j].push(question[j].replace(/Question(.*)- /gm, ""));
  }
  ss.getRange(ss.getLastRow()   1, 1, header.length, 2).setValues(out); //edit range to proper coordinates
}

References:

CodePudding user response:

There are a couple of assumptions for this script:

  • There is a finite number of header styles you're using (e.g. 'Header 1' and 'Header 2' in my example below)
  • Your questions always contain 'QUESTION:'
  • There is no other text apart from the headers and the question lines (but if there is, in principle it will be skipped)

In that case, the below code will work:

function getHeadings() {
  var pars = DocumentApp.getActiveDocument().getBody().getParagraphs();
  var currentHdg = "";
  var questions = [];
  for(var i=0; i<pars.length; i  ) {
    var par = pars[i];
    var text = par.getText();
    var hdg = par.getHeading();
    if (hdg.toString().indexOf("HEADING") > -1){
      currentHdg = text;
    }
    else if(text.indexOf("QUESTION:") > -1){
      questions.push([currentHdg,text.replace("QUESTION:","").trim()]);
    }
  }
  Logger.log(questions);
}

You can then format questions into the table output format you need.

Edit: I have updated my answer to cover all heading types.

  • Related