Home > OS >  Parsing a (RNS) polynomial with regex
Parsing a (RNS) polynomial with regex

Time:10-06

I'm trying to write a Regex to parse a polynomial in RNS (residue number system) form:

RNS Polynomial ([Modulus(68719403009), Modulus(68719230977), Modulus(137438822401)]): (67699591241,42814670386,92925202514)x^0   (42539574637,55054036653,135659663247)x^1   (52858091297,11618896202,6855552742)x^2   (45970532823,20845087073,91272562929)x^3   (11148839321,55275439733,5401722690)x^4   (31959765643,40620395732,93052536121)x^5   (57030732406,66026147059,6304524013)x^6   (27778918692,11276356856,61606736382)x^7

I wrote:

\(\[(Modulus\(\d \)(,*)\s*) \]\):(\s*\((\d*,*) \)x\^\d\s*\ *) 

but I'm only able to parse it entirely. I cannot have a group for the Modulus, neither the coefficients (a,b,c,...)x^n. What is wrong?

Here's the regex101: https://regex101.com/r/itBQHJ/1

CodePudding user response:

Your pattern has superfluous capture groups, and repeats 2 capture groups which will capture the value of the last iteration.

Also this \ * and ,* will allow trailing plus signs and comma's.


One option to is to match the data format of the example string using 2 capture groups and an unrolled form of the pattern, optionally repeating the same pattern preceding with , or to prevent allowing trailing and ,

In the code, you can check for the group 1 and group 2 value and use split to get the separate parts.

The pattern is slightly longer

\(\[(Modulus\(\d \)(?:\s*,\s*Modulus\(\d \))*)\]\):(\(\d (?:,\d )*\)x\^\d (?:\s*\ \s*\(\d (?:,\d )*\)x\^\d )*)

See a regex demo or a Java demo

An example in Java

String regex = "\\(\\[(Modulus\\(\\d \\)(?:\\s*,\\s*Modulus\\(\\d \\))*)\\]\\):(\\(\\d (?:,\\d )*\\)x\\^\\d (?:\\s*\\ \\s*\\(\\d (?:,\\d )*\\)x\\^\\d )*)";
String string = "RNS Polynomial ([Modulus(68719403009), Modulus(68719230977), Modulus(137438822401)]):(67699591241,42814670386,92925202514)x^0   (42539574637,55054036653,135659663247)x^1   (52858091297,11618896202,6855552742)x^2   (45970532823,20845087073,91272562929)x^3   (11148839321,55275439733,5401722690)x^4   (31959765643,40620395732,93052536121)x^5   (57030732406,66026147059,6304524013)x^6   (27778918692,11276356856,61606736382)x^7\n";

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);

while (matcher.find()) {
    if (matcher.group(1) != null) {
        for (String elm1 : matcher.group(1).split(",\\s*"))
            System.out.println(elm1);
    }
    if (matcher.group(2) != null) {
        for (String elm2 : matcher.group(2).split("\\s*\\ \\s*"))
            System.out.println(elm2);
    }
}

Output

Modulus(68719403009)
Modulus(68719230977)
Modulus(137438822401)
(67699591241,42814670386,92925202514)x^0
(42539574637,55054036653,135659663247)x^1
(52858091297,11618896202,6855552742)x^2
(45970532823,20845087073,91272562929)x^3
(11148839321,55275439733,5401722690)x^4
(31959765643,40620395732,93052536121)x^5
(57030732406,66026147059,6304524013)x^6
(27778918692,11276356856,61606736382)x^7

Another pattern could be using an alternation | to match both specific parts of the string, but it will not validate the whole data structure.

\b(Modulus\(\d \))|(\(\d (?:,\d )*\)x\^\d )\b

Regex demo

  • Related