String s = #Section250342,Main,First/HS/12345/Jack/M,2000 10.00,
#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,
#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,
#Section251234,Main,First/HS/12345/Jack/M,2000 11.00
Wherever /Jack/M appears in the string, I want to pull out the section numbers (250342, 251234) and the values (10.00, 11.00) associated with it, using a regex.
I tried something like this https://regex101.com/r/4te0Lg/1 but it still does not work.
.Section(\d+(?:\.\d+)?).*/Jack/M
CodePudding user response:
You could use 2 capture groups, and use a tempered greedy token approach so the match does not cross the next #Section followed by a digit.
#Section(\d+)(?:(?!#Section\d).)*\bJack/M,\d+\h+(\d+(?:\.\d+)?)\b
Explanation
- #Section(\d+) - Match #Section and capture 1+ digits in group 1
- (?:(?!#Section\d).)* - Match any character, as long as #Section followed by a digit does not start at that position
- \bJack/M, - Match the word Jack followed by /M,
- \d+\h+ - Match 1+ digits and 1+ horizontal whitespace characters
- (\d+(?:\.\d+)?) - Capture group 2, match 1+ digits and an optional decimal part
- \b - A word boundary
In Java:
String regex = "#Section(\\d+)(?:(?!#Section\\d).)*\\bJack/M,\\d+\\h+(\\d+(?:\\.\\d+)?)\\b";
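To show how the two groups could be read back, here is a minimal sketch that runs this pattern with Matcher.find() over the sample string from the question. The class name TemperedTokenDemo and the helper extract are made up for illustration; they are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemperedTokenDemo {
    // Pattern from this answer: group 1 = section number, group 2 = value.
    static final Pattern JACK = Pattern.compile(
        "#Section(\\d+)(?:(?!#Section\\d).)*\\bJack/M,\\d+\\h+(\\d+(?:\\.\\d+)?)\\b");

    // Hypothetical helper: collects "section=value" for every match in the input.
    static List<String> extract(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = JACK.matcher(input);
        while (m.find()) { // find() scans the whole string, one match per Jack/M section
            out.add(m.group(1) + "=" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "#Section250342,Main,First/HS/12345/Jack/M,2000 10.00,"
                 + "#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,"
                 + "#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,"
                 + "#Section251234,Main,First/HS/12345/Jack/M,2000 11.00";
        System.out.println(extract(s)); // prints [250342=10.00, 251234=11.00]
    }
}
```

Because the tempered token stops before the next #Section, the Aaron and Jimmy sections can never be paired with a later Jack/M, so only the two Jack sections are reported.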
CodePudding user response:
If the only parts of each section that change are the section number, the name of the person and the last value (like in your example) then you can make a pattern very easily by using one of the sections where Jack appears and replacing the numbers you want by capturing groups.
Example:
#Section250342,Main,First/HS/12345/Jack/M,2000 10.00
becomes,
#Section(\d+),Main,First/HS/12345/Jack/M,2000 (\d+\.\d{2})
If the section substring keeps the format but the other parts of it may change then just replace the rest like this:
#Section(\d+),\w+,(?:\w+/)*Jack/M,\d+ (\d+\.\d{2})
I'm assuming that "Main" is a class, "First/HS/..." is a path and that the last value always has 2 and only 2 decimal places.
- \d - A digit: [0-9]
- \w - A word character: [a-zA-Z_0-9]
- + - one or more times
- * - zero or more times
- {2} - exactly 2 times
- () - a capturing group
- (?:) - a non-capturing group
For reference see: https://docs.oracle.com/en/java/javase/18/docs/api/java.base/java/util/regex/Pattern.html
Simple Java example on how to get the values from the capturing groups using java.util.regex.Pattern and java.util.regex.Matcher
import java.util.regex.*;

public class GetMatch {
    public static void main(String[] args) {
        String s = "#Section250342,Main,First/HS/12345/Jack/M,2000 10.00,#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,#Section251234,Main,First/HS/12345/Jack/M,2000 11.00";
        // Group 1 is the section number, group 2 is the last value
        Pattern p = Pattern.compile("#Section(\\d+),\\w+,(?:\\w+/)*Jack/M,\\d+ (\\d+\\.\\d{2})");
        Matcher m;
        String[] tokens = s.split(",(?=#)"); // split the sections into different strings
        for (String t : tokens) { // check every string that we got with the split
            m = p.matcher(t);
            if (m.matches()) { // if the whole string matches the pattern, print the capturing groups
                System.out.printf("Section: %s, Value: %s\n", m.group(1), m.group(2));
            }
        }
    }
}
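If the name to look for is not always Jack/M, the same pattern could be built around a parameter. The following is only a sketch under that assumption: the class SectionLookup and the method valuesFor are hypothetical names, and it uses Matcher.find() on the whole string instead of splitting first. Pattern.quote keeps any regex metacharacters in the name literal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SectionLookup {
    // Hypothetical helper: maps section number -> value for every section
    // whose path ends in the given tag (for example "Jack/M").
    static Map<String, String> valuesFor(String input, String tag) {
        Pattern p = Pattern.compile(
            "#Section(\\d+),\\w+,(?:\\w+/)*" + Pattern.quote(tag) + ",\\d+ (\\d+\\.\\d{2})");
        Map<String, String> result = new LinkedHashMap<>(); // keeps the order of appearance
        Matcher m = p.matcher(input);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "#Section250342,Main,First/HS/12345/Jack/M,2000 10.00,"
                 + "#Section250322,Main,First/HS/12345/Aaron/N,2000 17.00,"
                 + "#Section250399,Main,First/HS/12345/Jimmy/N,2000 12.00,"
                 + "#Section251234,Main,First/HS/12345/Jack/M,2000 11.00";
        System.out.println(valuesFor(s, "Jack/M"));  // prints {250342=10.00, 251234=11.00}
        System.out.println(valuesFor(s, "Aaron/N")); // prints {250322=17.00}
    }
}
```

No split is needed here because neither \w+ nor (?:\w+/)* can match a comma, so a match cannot run from one section into the next.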