How to sort sections of a GDocs document alphabetically?

Time:11-01

Starting point: I have a doc with sections separated by page breaks.

Goal: I want a Google Apps Script that scans through a doc and sorts its sections alphabetically, where a section is everything between a heading1 and the following page break. The manual equivalent would be cut-pasting sections around until they are sorted.

To illustrate, here's an example document: initial state and desired end state.

So far, this is what I have:

A script that searches through a doc and identifies the headings with a given format (heading1). I could sort the headings alphabetically, but I don't know how to move whole sections around in the document.

function findHeading1s() {
  let doc = DocumentApp.getActiveDocument();//.openById('<doc-id>');
  var body = doc.getBody();

  // Define the search parameters.
  let searchType = DocumentApp.ElementType.PARAGRAPH;
  let searchHeading = DocumentApp.ParagraphHeading.HEADING1;
  let searchResult = null;

  let headers = [];
    
  // Search until the paragraph is found.
  while (searchResult = body.findElement(searchType, searchResult)) {
    let par = searchResult.getElement().asParagraph();
    let parText = par.getText();
    if (par.getHeading() == searchHeading && parText != "") {
      headers.push(parText);
      Logger.log(headers); //Headers in current order: [Zone, Template, Example]
    }
  }
  Logger.log(headers.sort()) //[Example, Template, Zone]
}

Any ideas on how to move everything between a heading and the following page break? I don't mind if the sorted end result is in a new document.

Is this doable with existing GAS capabilities? I tried debugging the body and paragraph elements to get some idea of how to solve this, but the debugger only shows the objects' functions, not their actual content, which is not very intuitive.

Here are the steps I think are needed:

  1. Identify heading1 headings ✅
  2. Sort headings alphabetically ✅
  3. Find each heading in the doc
  4. Cut-paste section in the appropriate position of the doc (or append to a new doc in the correct order)

Thanks!

Answer:

Issue:

You want to sort a series of document sections according to the section's heading text.

Solution:

Assuming that each section starts with a heading, you can do the following:

  • Iterate through all the body elements and record the index at which each section starts and ends (based on the location of the headings). The methods getNumChildren and getChild are useful here. You should end up with an array of section objects, each containing the start and end indexes as well as the heading text.
  • Sort the section objects by heading text.
  • Iterate through the sorted sections array and append copies of the children in each section's index range to the end of the document. Useful methods here are copy and the various append methods (e.g. appendParagraph). Here's a related question.
  • Note that only PARAGRAPH, TABLE and LIST_ITEM elements are appended in the sample below; if your content contains other element types, add the corresponding conditions to the code.
  • When appending the elements, it's important to apply the original element's attributes; otherwise, things like list bullets might not get appended (see this related question).
  • After the copied elements are appended, remove the original ones (here you can use removeFromParent).
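
The index bookkeeping from the first two steps can be tried out without a document. The following is only a sketch on plain data: the `isHeading1`/`text` records and the helper names `computeSections` and `sortSections` are hypothetical stand-ins for the body children, not part of the Apps Script API.

```javascript
// Hypothetical plain-data stand-in for the body children: each record says
// whether the child is a Heading 1 paragraph and carries its text.
function computeSections(children) {
  const sections = [];
  children.forEach((child, i) => {
    if (child.isHeading1 && child.text !== "") {
      // A new heading closes the previous section at the preceding child.
      if (sections.length) sections[sections.length - 1].endIndex = i - 1;
      sections.push({ startIndex: i, headingText: child.text });
    }
  });
  // The last section runs to the final child.
  if (sections.length) sections[sections.length - 1].endIndex = children.length - 1;
  return sections;
}

// Returns a sorted copy, ordered by heading text.
function sortSections(sections) {
  return [...sections].sort((a, b) => a.headingText.localeCompare(b.headingText));
}

const children = [
  { isHeading1: true, text: "Zone" },
  { isHeading1: false, text: "zone body" },
  { isHeading1: true, text: "Template" },
  { isHeading1: true, text: "Example" },
  { isHeading1: false, text: "example body" },
];
const sorted = sortSections(computeSections(children));
console.log(sorted.map(s => `${s.headingText}: ${s.startIndex}-${s.endIndex}`));
// [ 'Example: 3-4', 'Template: 2-2', 'Zone: 0-1' ]
```

Each sorted entry still carries the original index range, which is what the copy step needs in order to know which children to append.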

Code sample:

function findHeading1s() {
  const doc = DocumentApp.getActiveDocument();
  const body = doc.getBody();
  const parType = DocumentApp.ElementType.PARAGRAPH;
  const headingType = DocumentApp.ParagraphHeading.HEADING1;
  const numChildren = body.getNumChildren();
  const sections = [];
  // Record where each section starts and ends, based on its non-empty heading1.
  for (let i = 0; i < numChildren; i++) {
    const child = body.getChild(i);
    if (child.getType() === parType) {
      const par = child.asParagraph();
      const parText = par.getText();
      if (par.getHeading() === headingType && parText != "") {
        // A new heading closes the previous section at the preceding child.
        if (sections.length) sections[sections.length - 1].endIndex = i - 1;
        sections.push({ startIndex: i, headingText: parText });
      }
    }
  }
  // Nothing to sort; this also avoids emptying the body in the removal loop.
  if (!sections.length) return;
  // The last section runs to the end of the body.
  sections[sections.length - 1].endIndex = numChildren - 1;
  // Children before the first heading belong to no section and stay in place.
  const firstStart = sections[0].startIndex;
  sections.sort((a, b) => a.headingText.localeCompare(b.headingText));
  // Append a copy of every section, in sorted order, after the original content.
  sections.forEach(section => {
    const { startIndex, endIndex } = section;
    for (let i = startIndex; i <= endIndex; i++) {
      const element = body.getChild(i);
      const copy = element.copy();
      const type = element.getType();
      const attributes = element.getAttributes();
      // Other element types are not copied and are lost when the originals are removed.
      if (type == DocumentApp.ElementType.PARAGRAPH) {
        body.appendParagraph(copy).setAttributes(attributes);
      } else if (type == DocumentApp.ElementType.TABLE) {
        body.appendTable(copy).setAttributes(attributes);
      } else if (type == DocumentApp.ElementType.LIST_ITEM) {
        body.appendListItem(copy).setAttributes(attributes);
      }
    }
  });
  // Remove the originals, last to first, so the remaining indexes stay valid.
  for (let i = numChildren - 1; i >= firstStart; i--) {
    body.getChild(i).removeFromParent();
  }
}
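
On the debugging point raised in the question (the debugger showing only the functions of the body and paragraph objects), one option is to log a short summary of each top-level child. This is a rough sketch: `describeChildren` and `logBodyStructure` are hypothetical helper names, and the `typeof` checks assume that the objects returned by getChild expose getHeading and getText as ordinary functions where the element type supports them.

```javascript
// Builds one line per top-level child: index, element type, heading (if any)
// and the first 40 characters of its text.
function describeChildren(body) {
  const lines = [];
  for (let i = 0; i < body.getNumChildren(); i++) {
    const child = body.getChild(i);
    const type = String(child.getType());
    // Assumption: only elements that have a heading expose getHeading().
    const heading = typeof child.getHeading === "function"
      ? " [" + child.getHeading() + "]"
      : "";
    // Assumption: elements without text content do not expose getText().
    const text = typeof child.getText === "function" ? child.getText() : "";
    lines.push(i + ": " + type + heading + " " + JSON.stringify(text.slice(0, 40)));
  }
  return lines;
}

// Entry point to run from the Apps Script editor.
function logBodyStructure() {
  const body = DocumentApp.getActiveDocument().getBody();
  Logger.log(describeChildren(body).join("\n"));
}
```

Running logBodyStructure before and after the sort gives a quick way to compare the order of the headings and to spot element types that the sample does not copy.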