How to test if a RegExp contains capturing groups in its definition?

Time:10-18

I'm trying to find two regular expressions:

  1. One to test whether another RegExp contains capturing groups, like in /ab(c)d/
  2. Same as 1., but only detecting named capturing groups, like /ab(?<name>c)d/

These "meta"-regexes would check the source property of a regex.

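Here is my best attempt for 1: /(?<!\\)\((?!\?:)/. The idea is to look for an opening parenthesis that is not preceded by \ and not followed by ?: (which would make it a non-capturing group). But this has false positives (/c[z(a]d/, for example, where the parenthesis sits inside a character class) and false negatives (/a\\(b)/, for example, where the backslash before the parenthesis is itself escaped).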

My attempt at 2. follows the same logic and thus has the same flaws: /(?<!\\)\(\?<(?![=!])/
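For reference, here is a minimal sketch that reproduces these failure cases. The names naiveAny, naiveNamed and samples are only illustrative, and the last two sample regexes are hypothetical additions chosen to trip up attempt 2 in the same way.

```javascript
// Attempt 1 and attempt 2 from above, applied to a regex's source text.
const naiveAny = /(?<!\\)\((?!\?:)/;
const naiveNamed = /(?<!\\)\(\?<(?![=!])/;

// Each row: a sample regex and whether it really contains a (named) capturing group.
const samples = [
    { re: /c[z(a]d/, any: false, named: false },   // "(" is a literal inside a character class
    { re: /a\\(b)/, any: true, named: false },      // "\\" is an escaped backslash, so (b) is a group
    { re: /[(?<x>]/, any: false, named: false },    // hypothetical sample: "(?<x>" inside a class
    { re: /\\(?<n>b)/, any: true, named: true },    // hypothetical sample: named group after "\\"
];

// Compare what each attempt reports with the expected answer.
const report = samples.map(({ re, any, named }) => ({
    source: re.source,
    anyCorrect: naiveAny.test(re.source) === any,
    namedCorrect: naiveNamed.test(re.source) === named,
}));
console.table(report);
```

Attempt 1 gets all four samples wrong, and attempt 2 gets the last two wrong.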

Any idea on how to do this properly? Thank you.

CodePudding user response:

You could use a regex that, besides spotting capture groups, also matches escape pairs (\\.) and character classes (\[(?:\\.|.)*?\], which is itself aware of escapes). A parenthesis inside either of those is then consumed as part of that token and cannot cause a false positive or negative. Then loop over the matches and keep only those that are actual capture groups.

The snippet below returns the number of anonymous capture groups and the names of the named capture groups:

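// Alternatives, in priority order: escape pair | character class | "(" not followed by "?" | "(?<name"
const reParser = /\\.|\[(?:\\.|.)*?\]|(\()(?!\?)|\(\?<([^=!][^>]*)/g;
function captureGroups(regex) {
    const names = [];
    let numAnonymous = 0;
    // anon is set for an unnamed group, name for a named one; both are undefined for skipped tokens
    for (const [match, anon, name] of regex.source.matchAll(reParser)) {
        if (name) names.push(name);
        else if (anon) numAnonymous++;
    }
    return { numAnonymous, names };
}

// Example run
console.log(captureGroups(/test[12\](3]*(?<xy>((\.))?)/g)); // { numAnonymous: 2, names: [ 'xy' ] }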
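To make the "loop over the matches" step more concrete, here is a small sketch that lists every token the parser regex matches in the example and how it would be classified. The trace array and the kind labels are illustrative only.

```javascript
const reParser = /\\.|\[(?:\\.|.)*?\]|(\()(?!\?)|\(\?<([^=!][^>]*)/g;
const source = /test[12\](3]*(?<xy>((\.))?)/.source;

// Record each matched token together with its classification.
const trace = [];
for (const [token, anon, name] of source.matchAll(reParser)) {
    const kind = name ? "named group" : anon ? "anonymous group" : "skipped";
    trace.push({ token, kind });
}
console.table(trace);
```

The character class [12\](3] and the escape pair \. show up as skipped tokens, which is what keeps the parenthesis inside the class from being counted.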

If you only need to know whether there is a capture group at all, you could first replace each escape pair and character class in the regex source with a single placeholder character. What remains is to recognise the capture group pattern:

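function hasCaptureGroups(regex) {
    // Neutralise escape pairs and character classes so any "(" left over is a real group opener
    const simpler = regex.source.replace(/\\.|\[(?:\\.|.)*?\]/g, "x");
    return {
        hasAnonymous: /\([^?]/.test(simpler),
        // Exclude the look-behind openers (?<= and (?<!, which also start with "(?<"
        hasNamed: /\(\?<[^=!]/.test(simpler)
    };
}

// Example run
console.log(hasCaptureGroups(/test[12\](3]*(?<xy>((\.))?)/g)); // { hasAnonymous: true, hasNamed: true }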
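To see what the replacement step leaves behind, here is a short sketch that prints a source before and after simplification. The helper name simplify is hypothetical, and the second sample is the false-positive case from the question.

```javascript
// Replace each escape pair and character class with the placeholder "x".
function simplify(regex) {
    return regex.source.replace(/\\.|\[(?:\\.|.)*?\]/g, "x");
}

const before = /test[12\](3]*(?<xy>((\.))?)/.source;
const after = simplify(/test[12\](3]*(?<xy>((\.))?)/);
console.log(before); // test[12\](3]*(?<xy>((\.))?)
console.log(after);  // testx*(?<xy>((x))?)

// The parenthesis inside the character class disappears with the class.
const afterClassOnly = simplify(/c[z(a]d/);
console.log(afterClassOnly); // cxd
```

Any parenthesis that survives the replacement is a real group opener, so the two simple tests in hasCaptureGroups are enough.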

To do this with a single regular expression and no replacement, match an input that has no capture group and then negate that. The negation can be done with a negative look-ahead at the very first position that scans the complete input:

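// Every alternative starts with a different character, so backtracking cannot re-read
// a "\" or "[" as an ordinary character and skip over a real group.
const reAnonymousGroup = /^(?!(?:\\.|\[(?:\\.|[^\]\\])*\]|[^(\\[]|\(\?)*$)/;
// A "(" is allowed here unless it opens a named group; (?<= and (?<! are look-behinds.
const reNamedGroup     = /^(?!(?:\\.|\[(?:\\.|[^\]\\])*\]|[^(\\[]|\((?!\?<[^=!]))*$)/;

// Example run
const regex = /test[12\](3]*(?<xy>((\.))?)/g;
console.log("has anonymous group:", reAnonymousGroup.test(regex.source)); // true
console.log("has named group:", reNamedGroup.test(regex.source)); // true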
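As a cross-check for pattern-based detectors like the ones above, you can let the regex engine parse the pattern itself. This is a different technique, not one of the meta-regexes asked for. Appending "|" to the source adds an empty alternative, so the probe always matches the empty string, and the resulting match array has one slot per capturing group. The function name captureGroupsViaEngine is hypothetical.

```javascript
// Count capturing groups by matching (source + "|") against the empty string.
function captureGroupsViaEngine(regex) {
    // Keep the original flags, since they can affect how the pattern is parsed.
    const probe = new RegExp(regex.source + "|", regex.flags);
    const match = probe.exec("");
    // match.groups is undefined when the pattern has no named groups.
    const names = Object.keys(match.groups ?? {});
    // Slot 0 is the whole match; every other slot is one capturing group.
    return { numAnonymous: match.length - 1 - names.length, names };
}

const results = [
    captureGroupsViaEngine(/test[12\](3]*(?<xy>((\.))?)/g),
    captureGroupsViaEngine(/c[z(a]d/),  // literal "(" inside a character class
    captureGroupsViaEngine(/a\\(b)/),   // group after an escaped backslash
    captureGroupsViaEngine(/(?<=a)b/),  // look-behind, not a named group
];
console.log(results);
```

Because the engine does the parsing, this handles character classes, escapes and look-behinds without special cases, at the cost of constructing a second RegExp object.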
