Home > other >  Given this compressed string, parse it and return the decompressed form in an array
Given this compressed string, parse it and return the decompressed form in an array

Time:12-13

Given this string of codes in a compressed form:

GMZDCOR[R[G[A5D[C,E,K,M,O],E5D[K,M,O],I5D[C,E,K,M,O]],HI[4Q,YT[A,C]]],SHI[2[A,Q],3[A,Q],4A,Y[Q,T[G,I]],Z[A,Q]],THI[2A,3Q,4[A,Q],Y[Q,TA],Z[A,Q]],UHI[YQ,ZA],VHI[2A,3Q,4[A,Q],Y[Q,T[A,C]],Z[A,Q]],WHI[3Q,4A,Y[Q,T[C,E]],ZA],XHI[2[A,Q],ZQ],YHI[2[A,Q],YQ,Z[A,Q]],ZHI[2Q,3[A,Q],YQ,ZA]]

I need it to look like this: [GMZDCORRGA5DC, GMZDCORRGA5DE, GMZDCORRGA5DK, GMZDCORRGA5DM, ...etc]

Is there a known algorithm for solving this kind of problem? I've tried basic splitting but looks like I need to keep track of all the brackets there.

CodePudding user response:

You can use a regular expression to tokenise the input:

  • (\w )([\[]?) to match some alphanumerical string followed by an optional opening bracket, or
  • \]: to match a closing bracket

Commas don't need to be captured -- they just function as delimiters.

Then use recursion to generate the results for nested expressions.

In JavaScript, this is a good candidate for writing a generator function which returns an iterator over the results:

function parse(s) {
    const tokens = s.matchAll(/(\w )([\[]?)|(\])|$/g);
    
    function* generate(isNested) {
        while (true) {
            const {value: [all, word, opening, closing]} = tokens.next();
            if (isNested ? closing : !all) break;
            if (opening) {
                for (const substr of generate(true)) {
                    yield word   substr;
                }
            } else {
                yield word;
            }
        }
    }
    return Array.from(generate());
}


const s = "GMZDCOR[R[G[A5D[C,E,K,M,O],E5D[K,M,O],I5D[C,E,K,M,O]],HI[4Q,YT[A,C]]],SHI[2[A,Q],3[A,Q],4A,Y[Q,T[G,I]],Z[A,Q]],THI[2A,3Q,4[A,Q],Y[Q,TA],Z[A,Q]],UHI[YQ,ZA],VHI[2A,3Q,4[A,Q],Y[Q,T[A,C]],Z[A,Q]],WHI[3Q,4A,Y[Q,T[C,E]],ZA],XHI[2[A,Q],ZQ],YHI[2[A,Q],YQ,Z[A,Q]],ZHI[2Q,3[A,Q],YQ,ZA]]";
console.log(parse(s));

  • Related