At the moment I have an Excel sheet with a column holding data in this form:
E 1-6,44-80
E 10-76
E 44-80,233-425
E 19-55,62-83,86-119,200-390
...
I need to be able to capture each range of numbers individually. For example, I would like the first line above to result in "1-6" and "44-80" being captured into their own groups. So, essentially I need to capture a repeating group.
When trying to use this pattern, which uses the general form for capturing repeating groups given by @ssent1 on this question:
E\s(([0-9]{1,4})-([0-9]{1,4}))((?:,([0-9]{1,4})-([0-9]{1,4}))*)
I end up only matching the first and last number ranges. I understand that this is because I'm repeating captured groups rather than capturing a repeating group, but I can't figure out how to correct my pattern. Any help would be greatly appreciated.
CodePudding user response:
In Java, you can use a capture group together with the \G anchor, which asserts the position at the end of the previous match, to get contiguous matches:
(?:^E\h+|\G(?!^),?(\d{1,4}-\d{1,4}))
Example
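import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String regex = "(?:^E\\h+|\\G(?!^),?(\\d{1,4}-\\d{1,4}))";
        String string = "E 1-6,44-80\n"
                + "E 10-76\n"
                + "E 44-80,233-425\n"
                + "E 19-55,62-83,86-119,200-390\n"
                + "200-390"; // no "E " prefix, so this line is not matched

        // MULTILINE lets ^ match at the start of every line
        Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(string);

        while (matcher.find()) {
            // group(1) is null when only the "E " prefix was matched
            if (matcher.group(1) != null) {
                System.out.println(matcher.group(1));
            }
        }
    }
}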
Output
1-6
44-80
10-76
44-80
233-425
19-55
62-83
86-119
200-390
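If the ranges need to stay grouped per cell rather than printed as one flat list, the same pattern can be applied to each cell value separately. The following is only a minimal sketch: the class name RangeExtractor and the helper extractRanges are hypothetical names chosen for illustration, each cell is assumed to arrive as its own String, and List.of assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RangeExtractor {
    // Same pattern as above. Each cell is matched on its own,
    // so the MULTILINE flag is not needed here.
    private static final Pattern RANGE =
            Pattern.compile("(?:^E\\h+|\\G(?!^),?(\\d{1,4}-\\d{1,4}))");

    // Hypothetical helper: returns the ranges found in a single cell value.
    static List<String> extractRanges(String cell) {
        List<String> ranges = new ArrayList<>();
        Matcher m = RANGE.matcher(cell);
        while (m.find()) {
            if (m.group(1) != null) { // skip the match that only consumed "E "
                ranges.add(m.group(1));
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Sample cell values taken from the question, plus one without the prefix.
        for (String cell : List.of("E 1-6,44-80", "E 10-76", "200-390")) {
            System.out.println(cell + " -> " + extractRanges(cell));
        }
    }
}
```

A cell that does not start with "E " followed by whitespace yields an empty list, because \G can only continue from a previous match and the first alternative never succeeds for it.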