Regex With Nested Square brackets

Time:09-30

I have a string such as ('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> ''), and I want to use a regex to extract the items below:

  • Test.A[0]
  • Test.B[0]
  • Test.C[0]
  • Test.D[0]

I tried \[.*?\], but it returns matches such as [Test.EVAPCT[0].

CodePudding user response:

You can use

\w+(?:\.\w+)*\[[^\][]*]
\w+(?:\.\w+)*\[\d+]

Details:

  • \w+ - one or more word chars
  • (?:\.\w+)* - zero or more sequences of a . and one or more word chars
  • \[ - a [ char
  • [^\][]* - zero or more chars other than [ and ] / \d+ - one or more digits
  • ] - a ] char.

See a demo below:

const text = "('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '')";
const regex = /\w+(?:\.\w+)*\[[^\][]*]/g;
console.log( text.match(regex) );
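The stricter \d+ variant listed above is not exercised by the demo, so here is a minimal sketch comparing the two patterns. The sample string is hypothetical (it is not from the question) and only exists to show a non-numeric index.

```javascript
// Hypothetical input: one numeric index and one non-numeric index.
const sample = "'[Test.A[0]]' <>'' OR '[Test.B[x]]' <>''";

// Permissive variant: anything except [ and ] may appear inside the inner brackets.
const anyIndex = /\w+(?:\.\w+)*\[[^\][]*]/g;
// Strict variant: only one or more digits may appear inside the inner brackets.
const digitIndex = /\w+(?:\.\w+)*\[\d+]/g;

console.log(sample.match(anyIndex));   // both items
console.log(sample.match(digitIndex)); // only the item with a numeric index
```

Which variant fits depends on whether the index is always numeric in your data.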

To also cater for cases like [Test.F], you may use a regex that follows slightly different logic:

/(?<=\[)\w+(?:\.\w+)*(?:\[[^\][]*])?(?=])/g

See the demo below:

const text = "('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '') [Test.F]";
const regex = /(?<=\[)\w+(?:\.\w+)*(?:\[[^\][]*])?(?=])/g;
console.log( text.match(regex) );

Details:

  • (?<=\[) - a location right after a [ char
  • \w+(?:\.\w+)*(?:\[[^\][]*])? - one or more word chars, then zero or more sequences of . and one or more word chars, then an optional occurrence of a [...] substring
  • (?=]) - a location right before a ] char.
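Lookbehind assertions were only added to JavaScript in ES2018, so older engines reject the pattern above. As a rough sketch, assuming you need to support such an engine, the same inner pattern can be wrapped in a capture group while the surrounding brackets are consumed. The names captureRegex and items are illustrative.

```javascript
const text = "('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '') [Test.F]";

// Same inner pattern, but the surrounding [ and ] are consumed and the
// wanted part is captured in group 1 instead of asserted with lookarounds.
const captureRegex = /\[(\w+(?:\.\w+)*(?:\[[^\][]*])?)]/g;

// matchAll yields one match object per hit; index 1 holds the captured group.
const items = Array.from(text.matchAll(captureRegex), (m) => m[1]);
console.log(items);
```

The trade-off is that the wanted text is no longer the whole match, so it has to be read from group 1.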

CodePudding user response:

The pattern \[.*?\] starts each match at a [ and runs to the first ] that follows. Because .*? can also match [, the match begins at the outer [ and therefore includes too much.
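To see the over-match concretely, this short sketch runs the original pattern against the string from the question. The variable names are illustrative only.

```javascript
// The string from the question.
const text = "('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '')";

// Lazy .*? stops at the first ], but the match already began at the outer [.
const naiveMatches = text.match(/\[.*?\]/g);
console.log(naiveMatches);
// Each match keeps the outer [ and loses the final ], e.g. "[Test.A[0]"
```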

You could match the digits between the square brackets to make it a bit more specific:

[^\][]+\[\d+\]

The pattern matches

  • [^\][]+ Match 1+ chars other than the square brackets using a negated character class
  • \[\d+\] Match 1+ digits between the square brackets
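A minimal sketch of this pattern in use, run against the string from the question (the name digitsRegex is illustrative):

```javascript
const str = `('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '')`;

// One or more non-bracket chars, followed by digits wrapped in [ and ].
const digitsRegex = /[^\][]+\[\d+\]/g;

console.log(str.match(digitsRegex));
// The outer [ is skipped because the negated class cannot match it.
```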


A slightly broader variant matches optional chars other than [, ], or a whitespace char before the opening square bracket:

[^\s\][()']*\[[^\s\][]+\]

The pattern matches:

  • [^\s\][()']* Optionally match chars other than the listed in the character class
  • \[ Match [
  • [^\s\][]+ Match 1+ chars other than [ ] or a whitespace char
  • \] Match the closing ]


const str = `('[Test.F]' <>'' OR '[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '')`;
const regex = /[^\s\][()']*\[[^\s\][]+\]/g;
console.log(str.match(regex));

Matching Test.F instead of [Test.F] using a capture group:

\[([^\][]*(?:\[[^\][]*])?)]


const str = `('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '') [Test.F]`;
const regex = /\[([^\][]*(?:\[[^\][]*])?)]/g;
console.log(Array.from(str.matchAll(regex), m => m[1]));
