Given this string of codes in a compressed form:
GMZDCOR[R[G[A5D[C,E,K,M,O],E5D[K,M,O],I5D[C,E,K,M,O]],HI[4Q,YT[A,C]]],SHI[2[A,Q],3[A,Q],4A,Y[Q,T[G,I]],Z[A,Q]],THI[2A,3Q,4[A,Q],Y[Q,TA],Z[A,Q]],UHI[YQ,ZA],VHI[2A,3Q,4[A,Q],Y[Q,T[A,C]],Z[A,Q]],WHI[3Q,4A,Y[Q,T[C,E]],ZA],XHI[2[A,Q],ZQ],YHI[2[A,Q],YQ,Z[A,Q]],ZHI[2Q,3[A,Q],YQ,ZA]]
I need it to look like this:
[GMZDCORRGA5DC, GMZDCORRGA5DE, GMZDCORRGA5DK, GMZDCORRGA5DM, ...etc]
Is there a known algorithm for solving this kind of problem? I've tried basic splitting but looks like I need to keep track of all the brackets there.
CodePudding user response:
You can use a regular expression to tokenise the input:
(\w )([\[]?)
to match some alphanumerical string followed by an optional opening bracket, or\]
: to match a closing bracket
Commas don't need to be captured -- they just function as delimiters.
Then use recursion to generate the results for nested expressions.
In JavaScript, this is a good candidate for writing a generator function which returns an iterator over the results:
function parse(s) {
const tokens = s.matchAll(/(\w )([\[]?)|(\])|$/g);
function* generate(isNested) {
while (true) {
const {value: [all, word, opening, closing]} = tokens.next();
if (isNested ? closing : !all) break;
if (opening) {
for (const substr of generate(true)) {
yield word substr;
}
} else {
yield word;
}
}
}
return Array.from(generate());
}
const s = "GMZDCOR[R[G[A5D[C,E,K,M,O],E5D[K,M,O],I5D[C,E,K,M,O]],HI[4Q,YT[A,C]]],SHI[2[A,Q],3[A,Q],4A,Y[Q,T[G,I]],Z[A,Q]],THI[2A,3Q,4[A,Q],Y[Q,TA],Z[A,Q]],UHI[YQ,ZA],VHI[2A,3Q,4[A,Q],Y[Q,T[A,C]],Z[A,Q]],WHI[3Q,4A,Y[Q,T[C,E]],ZA],XHI[2[A,Q],ZQ],YHI[2[A,Q],YQ,Z[A,Q]],ZHI[2Q,3[A,Q],YQ,ZA]]";
console.log(parse(s));