Find contents between square brackets and quotations

Time:02-06

To put it plainly, let's say I have this string:

command [stuff] [stuff [inside] this] "string" "another [thing] string"

In my code, I want to grab everything inside quotation marks and put it in one array, and grab everything inside the outermost brackets (including any nested brackets) and put it in another array. Like so:

const string = `command [stuff] [stuff [inside] this] "string" "another [thing] string"`;

let quotations = ["string", "another [thing] string"]
let brackets = ["stuff", "stuff [inside] this"] // I do not want to include any brackets found inside of quotation marks

I have tried to write a regex that does this, but I am having a lot of trouble understanding how to set it up. I did find these two regexes, which find the content in quotations and brackets, but they aren't 100% what I am looking for:

// JavaScript Regex
const regexStrings = /(["'])(?:(?=(\\?))\2.)*?\1/g;
const regexBrackets = /\[(.*?)\]/g;
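
To make the shortfall concrete, here is a small sketch that runs the bracket regex on the sample string. The names `sample`, `lazyBrackets`, and `captured` are only illustrative. The lazy match stops at the first closing bracket, so the nested case is cut short, and the bracket inside the quotation is picked up as well:

```javascript
const sample = `command [stuff] [stuff [inside] this] "string" "another [thing] string"`;
const lazyBrackets = /\[(.*?)\]/g;

// Collect only the captured group (the text between the brackets)
const captured = [...sample.matchAll(lazyBrackets)].map(m => m[1]);

console.log(captured); // [ 'stuff', 'stuff [inside', 'thing' ]
```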

CodePudding user response:

There is no support for recursion in JavaScript's regex syntax, so you'll need to throw in some code in order to cope with an arbitrary depth of bracket nesting.

I would therefore go for splitting the string into:

  • A quotation (starting and ending with a quotation mark, taking backslash escaping into account)
  • A substring that does not have any of '"[] characters
  • A single [ or ]

Then use a depth counter to keep track of how deeply the brackets are nested, so you know when to build a bracket substring by concatenating the tokens along the way.
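
As a small illustration of the tokenizing step on its own, the sketch below applies a pattern with the three alternatives listed above to a short made-up input. The names `tokenPattern`, `demoInput`, and `demoTokens` are assumptions for this example only:

```javascript
// Alternatives, in order: quotation | run of characters other than '"[] | any single character
const tokenPattern = /(['"])((\\.|.)*?)\1|[^[\]'"]+|./g;

const demoInput = `a [b [c]] "d [e]"`;
const demoTokens = demoInput.match(tokenPattern);

console.log(demoTokens);
// [ 'a ', '[', 'b ', '[', 'c', ']', ']', ' ', '"d [e]"' ]
```

Note that the brackets inside the quotation stay within a single token, which is why they never affect the depth counter.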

Here is a complete snippet, run on a more complex input string than the one you provided:

function solve(str) {
    // Tokens: a quotation (with backslash escapes), a run of plain characters, or any single character
    let tokens = str.match(/(['"])((\\.|.)*?)\1|[^[\]'"]+|./g) || [];
    let brackets = [];
    let quotations = [];
    let bracket = ""; // text collected since the current outermost "["
    let depth = 0;    // current bracket nesting depth
    for (let token of tokens) {
        if (token[0] === '"' || token[0] === "'") {
            quotations.push(token.slice(1, -1));
        } else if (token === "[") {
            depth++;
        } else if (token === "]") {
            depth--;
            if (depth < 0) throw new Error("Unbalanced brackets");
            if (!depth) {
                brackets.push(bracket.slice(1)); // drop the leading "["
                bracket = "";
            }
        }
        if (depth) bracket += token;
    }
    if (depth) throw new Error("Unbalanced brackets");
    return {quotations, brackets};
}


const string = String.raw`command [stuff] [stuff [inside [very inside with "escaped \" bracket:]" ]] this] "string" "another [thing] string"`;

console.log(solve(string));

CodePudding user response:

Here's an attempt.

The regex for the quotes matches a run of non-quote characters between two quotes.

The regex for the brackets first tries to match quoted text, and only then the content between brackets.
The matches that start with a quote are then filtered out.

There's no recursion, so it only handles one optional level of brackets nested within brackets.

const string = `command [stuff] [stuff [inside] this] "string" "another [thing] string"`;

// JavaScript Regex
const regexStrings = /"[^"]*"|'[^']*'/g;
const regexBrackets = /"[^"]*"|'[^']*'|(\[[^\[\]]*(?:\[[^\[\]]*\])?[^\[\]]*\])/g;

let quotations = string.match(regexStrings)
                       .map(x=>x.replace(/^["']|["']$/g,''));
let brackets = string.match(regexBrackets)
                     .filter(x=>!/^["']/.test(x))
                     .map(x=>x.slice(1, -1)); // drop the enclosing brackets

console.log(quotations);
console.log(brackets);
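
To illustrate the one-level limit mentioned in this answer, the sketch below runs the bracket pattern (written with `'[^']*'` for single-quoted text) on two made-up inputs. The names `oneLevelBrackets` and `matchBrackets` are hypothetical and exist only for this example. With two levels of nesting, the outermost pair is missed and an inner pair is returned instead:

```javascript
const oneLevelBrackets = /"[^"]*"|'[^']*'|(\[[^\[\]]*(?:\[[^\[\]]*\])?[^\[\]]*\])/g;

// Hypothetical helper: returns the bracket matches, skipping matches that are quotations
function matchBrackets(input) {
    return (input.match(oneLevelBrackets) || []).filter(x => !/^["']/.test(x));
}

console.log(matchBrackets('[a [b] c]'));   // [ '[a [b] c]' ]
console.log(matchBrackets('[a [b [c]]]')); // [ '[b [c]]' ]
```

For input that may nest deeper than one level, a depth counter over tokens is the safer route.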
