How to match text between as key-value pairs

Time:08-26

I need a regular expression that parses a CSS rule and matches its selector, display name, and properties. I’m not too familiar with regular expressions, but I have this expression, which matches a selector (if one exists) and a display name: /(?'selector'\#|\.?)(?'display'[a-zA-Z][a-zA-Z]*)\s*(?=\{)(.*?)(?=\})/gm.

I’ve managed to match the selector and display groups; what I can’t do is match the properties. For example, in the test string below, I want one group in the match that returns background: red and another group that returns color: green.

I can match the content between delimiters ({ and }) with lookahead and lookbehind, but I couldn’t manage to create groups in my match to extract my properties.

I’m testing these on this string: #id { background: red; color: green; }

To be clear, my desired result on this string is:

  1. Selector (working)
  2. Name (working)
  3. First property (in this case, this value should be background: red)
  4. Second property (in this case, this value should be color: green)
  5. Other properties if they exist

CodePudding user response:

Rather than trying to decompose the whole CSS rule in a single complex regex, I would write a simple parser that uses simpler regexes.

A CSS rule consists of clearly defined separate parts:
A selector[1], an open brace '{', some declarations, and a close brace '}'

<selector> '{'
    <declaration>';'*
'}'

A declaration is a property, a colon ':', the value, then a semi-colon ';'

Using these key characters {:;} we can use three simple regexes to split the rule into its parts.
The first regex looks a bit complicated at first glance because it accounts for optional whitespace, so uses \s* in several places:

/^\s*(?<selector>[^{]+)\s*{\s*(?<declarations>.+)\s*}/ms

but it basically just finds and separates the selector and the declarations that are enclosed in braces {...}.
Note that it uses the multi-line flag m, so ^ can match at the start of any line, and the "dotAll" flag s, so . also matches newlines; this lets it match CSS rules as they are usually written, on multiple lines.
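
To see what the s flag contributes, here is a minimal sketch that runs the same pattern with and without it on a multi-line rule. The variable names (reWithS, reWithoutS, multiLineRule) are made up for this illustration.

```javascript
// Same pattern twice: once with the "s" (dotAll) flag, once without.
const reWithS = /^\s*(?<selector>[^{]+)\s*{\s*(?<declarations>.+)\s*}/ms;
const reWithoutS = /^\s*(?<selector>[^{]+)\s*{\s*(?<declarations>.+)\s*}/m;

// A rule written across several lines, as CSS usually is.
const multiLineRule = `#id {
    background: red;
    color: green;
}`;

const withS = reWithS.exec(multiLineRule);
const withoutS = reWithoutS.exec(multiLineRule);

// With "s", the dot crosses newlines, so both declarations are captured.
console.log(withS.groups.selector.trim());     // #id
console.log(withS.groups.declarations.trim()); // both declarations, still separated by a newline

// Without "s", the dot stops at the first newline and the closing brace is never reached.
console.log(withoutS);                         // null
```

Without the flag, .+ cannot get past the end of the first declaration line, so the overall match fails on this input.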

Once it has those, it uses simple string .split calls to break down the declarations. Declarations end in semi-colons, so .split(/\s*;\s*/) produces an array of declarations. Each declaration is a property:value pair, so it is split on the colon : and the resulting array is destructured: [property, value] = declaration.split(/\s*:\s*/)
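
The two splitting steps can be tried in isolation. This is only a sketch: the declarations string is the sample from the question, and the pairs variable is a name chosen for this illustration.

```javascript
// The text between the braces of the sample rule.
const declarations = 'background: red; color: green;';

const pairs = declarations
    .split(/\s*;\s*/)              // one entry per declaration, plus a trailing ''
    .filter(d => d.length > 0)     // drop the empty entry left by the final semi-colon
    .map(d => {
        // Split "property: value" on the colon and destructure the two parts.
        const [property, value] = d.split(/\s*:\s*/);
        return { property, value };
    });

console.log(pairs);
// [ { property: 'background', value: 'red' },
//   { property: 'color', value: 'green' } ]
```

One caveat of splitting on every colon: a value that itself contains a colon, such as a url(https://...) value, would be cut short by the destructuring, so this approach suits simple declarations like the ones in the question.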

I created a function from those steps that takes the CSS rule as a string and returns an object representing the selector and all of the declarations.

const ruleA = '#id { background:red; color: green; }';
const ruleB = `#id {
    background: red;
    color:green;
}`;

let parsedRule = parseRule(ruleB);
console.log('parsedRule:', parsedRule);
  
function parseRule(rule) {
    const re_rule =
        /^\s*(?<selector>[^{]+)\s*{\s*(?<declarations>.+)\s*}/ms;
    const base = re_rule.exec(rule);
    const selector = base.groups.selector.trim();
    const declarations = base.groups.declarations.trim();

    const result = {
        'selector' : selector,
        'declarations' : []
    };

    declarations.split(/\s*;\s*/)
        .forEach(d => {
            // Declare with const so property and value do not leak as implicit globals.
            const [property, value] = d.split(/\s*:\s*/);
            //console.log('prop length:', property.length);
            //console.log(`property: '${property}'`);
            //console.log(`value:' '${value}'`);
            if (property?.trim().length > 0) {
                result.declarations.push({
                    "prop" : property,
                    "value" : value
                });
            }
        });

    return result;
}

[1] "a selector" could, of course, be multiple comma-separated selectors, and that could be handled, but I'm omitting it to keep this example simpler and, hopefully, clearer.
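
For completeness, here is one possible way the comma-separated case could be handled: split the captured selector text on commas. This is a sketch, not part of the function above; splitSelectors is a hypothetical helper name.

```javascript
// Hypothetical helper: turn "a, b ,c" into ['a', 'b', 'c'].
function splitSelectors(selectorText) {
    return selectorText
        .split(/\s*,\s*/)          // split on commas, swallowing surrounding whitespace
        .map(s => s.trim())        // trim the outer ends of the first and last entries
        .filter(s => s.length > 0); // ignore empty entries, e.g. from a trailing comma
}

console.log(splitSelectors('#id, .note ,h1'));
// [ '#id', '.note', 'h1' ]
```

It assumes simple selectors; a selector that contains commas inside parentheses, such as :is(h1, h2), would be split incorrectly.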
