I'm trying to write a Regex to parse a polynomial in RNS (residue number system) form:
RNS Polynomial ([Modulus(68719403009), Modulus(68719230977), Modulus(137438822401)]): (67699591241,42814670386,92925202514)x^0 (42539574637,55054036653,135659663247)x^1 (52858091297,11618896202,6855552742)x^2 (45970532823,20845087073,91272562929)x^3 (11148839321,55275439733,5401722690)x^4 (31959765643,40620395732,93052536121)x^5 (57030732406,66026147059,6304524013)x^6 (27778918692,11276356856,61606736382)x^7
I wrote:
\(\[(Modulus\(\d \)(,*)\s*) \]\):(\s*\((\d*,*) \)x\^\d\s*\ *)
but I'm only able to parse it entirely. I cannot have a group for the Modulus
, neither the coefficients (a,b,c,...)x^n
. What is wrong?
Here's the regex101: https://regex101.com/r/itBQHJ/1
CodePudding user response:
Your pattern has superfluous capture groups, and repeats 2 capture groups which will capture the value of the last iteration.
Also this \ *
and ,*
will allow trailing plus signs and comma's.
One option to is to match the data format of the example string using 2 capture groups and an unrolled form of the pattern, optionally repeating the same pattern preceding with ,
or
to prevent allowing trailing
and ,
In the code, you can check for the group 1 and group 2 value and use split to get the separate parts.
The pattern is slightly longer
\(\[(Modulus\(\d \)(?:\s*,\s*Modulus\(\d \))*)\]\):(\(\d (?:,\d )*\)x\^\d (?:\s*\ \s*\(\d (?:,\d )*\)x\^\d )*)
See a regex demo or a Java demo
An example in Java
String regex = "\\(\\[(Modulus\\(\\d \\)(?:\\s*,\\s*Modulus\\(\\d \\))*)\\]\\):(\\(\\d (?:,\\d )*\\)x\\^\\d (?:\\s*\\ \\s*\\(\\d (?:,\\d )*\\)x\\^\\d )*)";
String string = "RNS Polynomial ([Modulus(68719403009), Modulus(68719230977), Modulus(137438822401)]):(67699591241,42814670386,92925202514)x^0 (42539574637,55054036653,135659663247)x^1 (52858091297,11618896202,6855552742)x^2 (45970532823,20845087073,91272562929)x^3 (11148839321,55275439733,5401722690)x^4 (31959765643,40620395732,93052536121)x^5 (57030732406,66026147059,6304524013)x^6 (27778918692,11276356856,61606736382)x^7\n";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
if (matcher.group(1) != null) {
for (String elm1 : matcher.group(1).split(",\\s*"))
System.out.println(elm1);
}
if (matcher.group(2) != null) {
for (String elm2 : matcher.group(2).split("\\s*\\ \\s*"))
System.out.println(elm2);
}
}
Output
Modulus(68719403009)
Modulus(68719230977)
Modulus(137438822401)
(67699591241,42814670386,92925202514)x^0
(42539574637,55054036653,135659663247)x^1
(52858091297,11618896202,6855552742)x^2
(45970532823,20845087073,91272562929)x^3
(11148839321,55275439733,5401722690)x^4
(31959765643,40620395732,93052536121)x^5
(57030732406,66026147059,6304524013)x^6
(27778918692,11276356856,61606736382)x^7
Another pattern could be using an alternation |
to match both specific parts of the string, but it will not validate the whole data structure.
\b(Modulus\(\d \))|(\(\d (?:,\d )*\)x\^\d )\b