Home > database >  Extract Text Sections from a Cell
Extract Text Sections from a Cell

Time:03-15

Given a cell with text pulled from HTML in a format like this:

OA-1

Interpret products of whole numbers, e.g., interpret 5 × 7 as the total number of objects in 5 groups of 7 objects each. For example, describe a context in which a total number...

  • More

OA-2

Interpret whole-number quotients of whole numbers, e.g., interpret 56 ÷ 8 as the number of objects in each share when 56 objects are partitioned equally into 8 shares, or as a number ...

Goal: extract a list of header identifiers so that output looks like this: OA-1,OA-2...

I've pulled data via the =importhtml function as shown in the two examples on this MWE sheet.

Noting that char(10) is a return character, I'm thinking of code something like this psuedocode:

Left(Cell_with_text,number_of_characters = find(first char(10))-1)&","&"find_next_heade" \r

Another approach might me to make a library of all the headers (eg, "OA-1,OA-2...") and somehow find each instance in the cell, maybe with a find function in an array?

Assumptions

  • The headers could have between 3 and 7 characters.

  • Headers do not always start with the same letter.

  • Headers do always have a dash, but it could be anywhere from the 2nd to the 2nd to last position.

  • There is always a char(10) after each header.

CodePudding user response:

This formula Splits all of them at once, then just keeps the 1st column (which is your desired output). Then do the JOIN().

=JOIN(", ",INDEX(SPLIT(importhtml("https://contentexplorer.smarterbalanced.org/target/m-g3-c1-ta","list",3),CHAR(10)),,1))

Here is a sample sheet, viewable to all in perpetuity.

  • Related