How to extract only the last two lines of text from a string?-CodePudding

A1:

88'



21



Corinthians
SE Palmeiras

A2:

90'
 1


03

Club Sporting Canamy
Reboceros de La Piedad

Current formula in B1:

=ARRAYFORMULA(TRANSPOSE(QUERY(QUERY(TRANSPOSE(SPLIT(TRIM(A1:A),char(10))),,2),"offset 1")))

My expected result is to collect the last two lines of text that are the two names, leaving the first name in one Column and the second name in another Column, like this:

COLUMN B                   COLUMN C

Corinthians                SE Palmeiras
Club Sporting Canamy       Reboceros de La Piedad

The problem is that depending on the number of lines that have text in each of the cells, the result for headers and the removal of the QUERY offset ends up being different, what formula can I use to always collect the last two lines of text in the string?

CodePudding user response：

Try this formula:

=split(substitute(mid(substitute(A1,char(10),char(9670),(LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),"")) 1*COUNTA(A1) -2)),(search(char(9670),substitute(A1,char(10),char(9670),(LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),"")) 1*COUNTA(A1) -2))) 1),(len(A1)-search(char(9670),substitute(A1,char(10),char(9670),(LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),"")) 1*COUNTA(A1)-2))))),char(10),","),",")

Logic

Count the number of line feeds in a given cell
LEN(A1)-LEN(SUBSTITUTE(A1,CHAR(10),"")) 1*COUNTA(A1)
Substitute a unique character in place of the 2nd last line feed
substitute(A1,char(10),char(9670),<line feed count>-2)
Get the position of the unique character
search(char(9670),<substituted text>)
Get the length of the original cell
len(A1)
Extract the text between the unique character and the end of the string
mid()
Substitute a comma for the remaining line feed
substitute(<the text>,char(10),",")
Split the result by comma
split(<the result>,",")

CodePudding user response：

you can use the formula REGEXEXTRACT() to perform this task. The regular expression "(.*\n.*)$" allows you to select a line of any characters .* a line break \n another line of any characters .* from the end $.

As a result, we have a cell with two lines inside. Split it into two columns with SPLIT() formula.

Now we wrap everything in an array formula with a check for empty rows. If original row is empty, we skip it

=ARRAYFORMULA((if(A1:A<>"",split(REGEXEXTRACT(A1:A,"(.*\n.*)$"),char(10)),)))