RegEx matching only within a match / restrict matching to part of string

Time:07-03

Is there a way to use a single regular expression to match only within another match? For example, if I want to remove spaces from a string, but only within parentheses:

source : "foobar baz blah (some sample text in here) and some more"

desired: "foobar baz blah (somesampletextinhere) and some more"

In other words, is it possible to restrict matching to a specific part of the string?

CodePudding user response:

One idea is to replace any space between parentheses using a lookahead pattern:

 (?=([^\s\(]+ )*\S*\))

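The lookahead succeeds only if a closing parenthesis follows the space before any opening one does: ([^\s\(]+ )* consumes any further space-separated words that contain no opening parenthesis (if present), and \S*\) then matches the last word up to the closing parenthesis.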

Detailed Regex Explanation:

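  • " ": a literal space (the character that gets replaced)
  • (?=([^\s\(]+ )*\S*\)): positive lookahead (a zero-width assertion, so only the space itself is matched)
    • ([^\s\(]+ )*: zero or more runs of characters that are neither whitespace nor an opening parenthesis, each followed by a space (this group is optional)
    • \S*\): zero or more non-whitespace characters followed by a closing parenthesis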

Check the demo here.
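
To make the lookahead idea concrete, here is a minimal Python sketch. It assumes the repeated group reads ([^\s\(]+ )*, that is, one or more non-space, non-"(" characters followed by a space; the function name remove_spaces_in_parens is made up for this illustration.

```python
import re

# A space, followed (lookahead only) by optional "word " runs without "("
# and then a final word that ends in ")".
# Assumption: the repeated group is "[^\s\(]+ " (note the "+").
SPACE_IN_PARENS = re.compile(r" (?=([^\s\(]+ )*\S*\))")


def remove_spaces_in_parens(text: str) -> str:
    """Delete every space that the lookahead places before a ")"."""
    return SPACE_IN_PARENS.sub("", text)


source = "foobar baz blah (some sample text in here) and some more"
result = remove_spaces_in_parens(source)
print(result)  # foobar baz blah (somesampletextinhere) and some more
```

Note that the pattern only looks ahead for a closing parenthesis and never checks that an opening one came before, so a stray ")" without a matching "(" would also cause the spaces in front of it to be removed.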

CodePudding user response:

In PCRE a combination of \G and \K can be used:

(?:\G(?!^)|\()[^)\s]*\K\s 
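  • \G continues where the previous match ended; (?!^) stops it from also matching at the very start of the string
  • \K resets the beginning of the reported match
  • [^)\s] matches any character that is neither a closing parenthesis nor whitespace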

See demo at regex101

The idea is to chain matches to an opening parenthesis. The chain links are either [^)\s]* or \s. Because \K resets the match start just before the \s, only the whitespace itself is reported as the match. This solution does not require a closing ).
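
Python's built-in re module supports neither \G nor \K, so the pattern above cannot be used there directly. The following is only a rough sketch of the same chaining logic done by hand: Pattern.match(text, pos) matches only at pos, which plays the role of \G, and a capturing group keeps the non-space part, which plays the role of \K. The names OPEN, LINK and strip_spaces_in_parens are invented for this illustration.

```python
import re

OPEN = re.compile(r"\(")           # start of a chain: an opening parenthesis
LINK = re.compile(r"([^)\s]*)\s")  # one chain link: non-")" non-space run, then one whitespace


def strip_spaces_in_parens(text: str) -> str:
    out = []
    pos = 0
    while True:
        start = OPEN.search(text, pos)
        if start is None:
            out.append(text[pos:])  # no further "(": copy the rest unchanged
            break
        out.append(text[pos:start.end()])  # copy everything up to and including "("
        pos = start.end()
        # Anchored matching at pos emulates \G: each link must begin
        # exactly where the previous one ended.
        while (link := LINK.match(text, pos)) is not None:
            out.append(link.group(1))  # keep the run, drop the whitespace
            pos = link.end()
    return "".join(out)


source = "foobar baz blah (some sample text in here) and some more"
print(strip_spaces_in_parens(source))
# foobar baz blah (somesampletextinhere) and some more
```

As with the regex it imitates, the chain simply stops at the first ")" or at the end of the string, so an unclosed "(" still has its following spaces removed: strip_spaces_in_parens("a (b c d") gives "a (bcd".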


In other regex flavors that support \G but not \K, capturing groups can help out. E.g. search for

(\G(?!^)|\()([^)\s]*)\s 

and replace with the captures of the two groups (depending on the language: $1$2 or \1\2) - Regex101 demo


Note: Another PCRE feature for skipping over certain parts is (*SKIP)(*F). It is often used together with The Trick. The idea is simple: skip this(*SKIP)(*F)|match that - Regex101 demo
