Home > other >  Need to extract String in between strings using REGEXEXTRACT() in Google Sheet
Need to extract String in between strings using REGEXEXTRACT() in Google Sheet

Time:11-18

I have a string like this patern: "abcd abcd | abcde | Degree SP | xyz abcd | abcd ABC"

I need to extract "Degree SP" using regular expressions. How can I do that? The condition here are:

  • string end with "SP "
  • string start after last "|".

I'm trying the Google Sheet formula REGEXEXTRACT(<input string>, "[\|\s]. SR[\s\|]") It returns " | abcde | Degree SP ". How can I restrict to extract from the last "|"?

CodePudding user response:

If the string Degree SP should be between pipes and a space:

\|\s([^\s|][^|]*SP)\s\|
  • \|\s Match | and a whitespace char
  • ( Capture group 1
    • [^\s|] Match a single char other than a space or |
    • [^|]*SP Match optional chars other than | and match SP
  • ) Close group 1
  • \s\| Match a whitespace char and |

enter image description here

If only the pipe after Degree SP is mandatory:

([^\s|][^|]*SP)\s*\|

enter image description here

CodePudding user response:

With your shown samples, please try following regex.

^.*?\s \S \s \|\s \S \s \|\s ([^\\|]*)\s \|.*$

Online demo for above regex

OR you want to catch value between 2nd and 3rd occurrence of | which ends with SP string then try following regex:

^.*?\s \S \s \|\s \S \s \|\s ([^\\|]*SP)\s \|.*$

Online demo for above regex

Explanation: Adding detailed explanation for above.

^.*?\s \S \s   ##Matching from starting of value with a lazy match till 1st occurrence of spaces followed by 1 or more non-spaces followed by 1 or more spaces.
\|\s \S \s \|  ##Matching |(literal) followed by spaces followed by 1 or more non-spaces followed by spaces with |(literal character) here.
\s             ##Matching 1 or more spaces occurrences here.
([^\\|]*)      ##Creating 1 and only capturing group which has everything till next occurrence of | to get Degree SP value mentioned by OP in samples.
\s \|.*$       ##Matching 1 or spaces followed by | till last of value/line.
  • Related