How can I extract values that have opening and closing brackets with regular expression?

Time:10-24

I am trying to extract every nesting level from [[String]] with a regular expression. Notice that each opening bracket [ needs its own closing bracket ]. So you would receive the following matches:

  • [[String]]
  • [String]
  • String

If I use \[[^\]]+\] it stops at the first closing bracket it comes across, without taking into consideration that a second bracket has opened in between and needs its own closing bracket. Is this at all possible with a regular expression?
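
To make the failure concrete, here is a minimal sketch in PHP, assuming the pattern in question is \[[^\]]+\] (the `+` quantifier is my reading of it). The variable names are only illustrative.

```php
<?php
// Naive pattern: "[", then one or more non-"]" characters, then "]".
// The character class happily consumes the inner "[", so the match
// ends at the first "]" and the brackets come out unbalanced.
$s = "[[String]]";
preg_match('~\[[^\]]+\]~', $s, $m);
echo $m[0], "\n"; // prints [[String]
```

The match contains two opening brackets but only one closing bracket, which is exactly the problem described above.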

Note: The type can be String, [String] or [[String]], so you don't know upfront how many brackets there will be.

CodePudding user response:

You can use the following PCRE-compliant regex:

(?=((\[(?:\w++|(?2))*])|\b\w+))

See the regex demo. Details:

  • (?= - start of a positive lookahead (necessary to match overlapping strings):
    • ( - start of Capturing group 1 (it will hold the "matches"):
      • (\[(?:\w++|(?2))*]) - Group 2 (technical, used for recursing): [, then zero or more occurrences of one or more word chars or the whole Group 2 pattern recursed, and then a ] char
      • | - or
      • \b\w+ - a word boundary (necessary since all overlapping matches are being searched for) and one or more word chars
    • ) - end of Group 1
  • ) - end of the lookahead.

See the PHP demo:

$s = "[[String]]";
if (preg_match_all('~(?=((\[(?:\w++|(?2))*])|\b\w+))~', $s, $m)){
    print_r($m[1]);
}

Output:

Array
(
    [0] => [[String]]
    [1] => [String]
    [2] => String
)
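
Since the question says the type can be String, [String] or [[String]], the same pattern can be wrapped in a small helper and tried on all three. This is only a sketch: the function name bracket_levels is hypothetical and not part of the answer above, and it assumes the input contains nothing but word characters and brackets.

```php
<?php
// Hypothetical helper: returns every nesting level of a bracketed type
// name, outermost first, using the recursive lookahead pattern.
function bracket_levels(string $type): array
{
    // The lookahead consumes nothing, so overlapping matches are found;
    // group 1 holds the text, group 2 is the recursed bracket pair.
    $pattern = '~(?=((\[(?:\w++|(?2))*])|\b\w+))~';
    preg_match_all($pattern, $type, $m);
    return $m[1]; // empty array when nothing matched
}

foreach (['String', '[String]', '[[String]]'] as $type) {
    echo $type, ' => ', implode(', ', bracket_levels($type)), "\n";
}
// String => String
// [String] => [String], String
// [[String]] => [[String]], [String], String
```

The number of brackets does not need to be known upfront, because the recursion into Group 2 handles any depth of balanced nesting.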