Given a cell with text pulled from HTML in a format like this:
OA-1
Interpret products of whole numbers, e.g., interpret 5 × 7 as the total number of objects in 5 groups of 7 objects each. For example, describe a context in which a total number...
- More
OA-2
Interpret whole-number quotients of whole numbers, e.g., interpret 56 ÷ 8 as the number of objects in each share when 56 objects are partitioned equally into 8 shares, or as a number ...
Goal: extract a list of header identifiers so that output looks like this: OA-1,OA-2...
I've pulled data via the =importhtml
function as shown in the two examples on this MWE sheet.
Noting that char(10)
is a return character, I'm thinking of code something like this
psuedocode:
Left(Cell_with_text,number_of_characters = find(first char(10))-1)&","&"find_next_heade" \r
Another approach might me to make a library of all the headers (eg, "OA-1,OA-2...") and somehow find each instance in the cell, maybe with a find function in an array?
Assumptions
The headers could have between 3 and 7 characters.
Headers do not always start with the same letter.
Headers do always have a dash, but it could be anywhere from the 2nd to the 2nd to last position.
There is always a char(10) after each header.
CodePudding user response:
This formula Splits all of them at once, then just keeps the 1st column (which is your desired output). Then do the JOIN().
=JOIN(", ",INDEX(SPLIT(importhtml("https://contentexplorer.smarterbalanced.org/target/m-g3-c1-ta","list",3),CHAR(10)),,1))