Home > Net >  How to Keep rows of multi-line cells containing a keyword in google sheets
How to Keep rows of multi-line cells containing a keyword in google sheets

Time:04-05

I'm trying to keep lines that contain the word "NOA" in a column A which has many multi-line cells as can be viewed in this enter image description here

Theoretical Approaches: I have been thinking about three approaches to solve this:

  1. ARRAYFORMULA(REGEXREPLACE - couldn't get it to work
  2. JOIN(FILTER(REGEXMATCH(TRANSPOSE - showing promise as it works in multiple steps
  3. Using the QUERY Function - unfamiliar w/ function but wondering if this function has a fast solution

Practical attempts:

FIRST APPROACH: first I attempted using REGEXEXTRACT to extract out everything that did not have NOA in it, the Regex worked in demo but didn't work properly in sheets. I thought this might be a concise way to get the value, perhaps if my REGEX skill was better?

ARRAYFORMULA(REGEXREPLACE(A1:A7, "^(?:[^N\n]|N(?:[^O\n]|O(?:[^A\n]|$)|$)|$) ",""))

I think the Regex because overly complex, didn't work in Google or perhaps the formula could be improved, but because Google RE2 has limitations it makes it harder to do certain things.

SECOND APPROACH:

Then I came up with an alternate approach which seems to work 2 stages (with multiple helper cells) but I would like to do this with one equation.

=TRANSPOSE(split(A2,CHAR(10)))
=TEXTJOIN(CHAR(10),1,FILTER(C2:C7,REGEXMATCH(C2:C7,"NOA")))

Questions:

  1. Can these formulas be combined and applied to the entire Column using an Index or Array?
  2. Or perhaps, the REGEX in my first approach can be modified?
  3. Is there a faster solution using Query?

The shared Google spreadhseet is here.

Thank you in advance for your help.

CodePudding user response:

Here's one way you can do that:

=index(substitute(substitute(transpose(trim(
query(substitute(transpose(if(regexmatch(split(
filter(A2:A,A2:A<>""),char(10)),"NOA"),split(
filter(A2:A,A2:A<>""),char(10)),))," ","❄️")
,,9^9)))," ",char(10)),"❄️"," "))

First, we split the data by the newline (char 10), then we filter out the lines that don't contain NOA and finally we use a "query smush" to join everything back together.

  • Related