Home > Back-end >  Removing everything between 2 strings with Google sheets RE2
Removing everything between 2 strings with Google sheets RE2

Time:11-03

I'm trying to remove something from a product title as part of a Google sheet

  • Example Johner Gladstone Pinot Noir 2015, 75CL
  • Stella Artois Premium Lager Bottle, 1 X 660 Ml
  • Pepesza Ppsh-40 Vodka Tommy Gun, 1 L

And I want to be able to remove everything from the , and either the CL, ML or L.

The problem I'm running into is that I don't know enough about regex and I'm struggling to find a good place to learn!

What I've tried so far is below

  • =REGEXREPLACE(A2,"[, ]\QML|CL\E","")

    but this doesn't work and I think its because [, ] isn't a valid part.

  • =REGEXREPLACE(A2,"\*\QML|CL\E","")

    because I know that , is the only punctuation in the titles - I've also tried this but not been successful.

CodePudding user response:

What you are trying to get is

(?i), .*?[CM]?L

See the regex demo. Details:

  • (?i) - case insensitive flag
  • , .*? - comma, space, and then any zero or more chars other than line break chars, as few as possible (due to *?, if you need as many as possible use * instead)
  • [CM]?L - C or M (optionally due to ?) and then an L char.

However, you can simply match from a , space till the end of the line:

", .*

See this regex demo. Here, the first comma space is matched and then the rest of the string (line, since . does not match line breaks by default).

See the regular expression syntax accepted by RE2.

  • Related