My goal is to extract the characters from a string such as
(B,C) (D,E) (A,B)
into the String array [B,C,D,E,A,B] using the split method.
The regex string I used is "\\(|\\)|,"
and the result I get is [B,C, ,D,E, ,A,B].
I have absolutely no idea why there are blank entries in the result array.
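String[] tokens = str.split("\\s"); // ["(B,C)", "(D,E)", "(A,B)"]
for (String token : tokens) {
    token.charAt(1); // first letter of the pair, e.g. 'B'
    token.charAt(3); // second letter of the pair, e.g. 'C'
}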
The code above would be another way to do it, but I just want to know what I am missing about the split method and the regex.
CodePudding user response:
What you want here is a regex find-all. One straightforward approach is to use a formal Pattern and Matcher:
String input = "(B,C) (D,E) (A,B)";
Pattern p = Pattern.compile("\\(([A-Z]+),\\s*([A-Z]+)\\)");
Matcher m = p.matcher(input);
List<String> matches = new ArrayList<>();
while (m.find()) {
matches.add(m.group(1));
matches.add(m.group(2));
}
System.out.println(matches); // [B, C, D, E, A, B]
Another approach, if you insist on using a string split, would be to first preprocess the string of tuples into a space separated list:
String input = "(B,C) (D,E) (A,B)";
input = input.replaceAll("\\(([A-Z]+),\\s*([A-Z]+)\\)\\s*", "$1 $2 ");
String[] parts = input.split(" ");
System.out.println(Arrays.toString(parts)); // [B, C, D, E, A, B]
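The replacement above leaves a trailing space at the end of the string, yet no empty entry shows up in the result. That is because split with the default limit discards trailing empty strings. Here is a minimal sketch of that behavior; the class name TrailingSplitDemo and the helper preprocessAndSplit are made up for illustration and are not part of the answer.

```java
public class TrailingSplitDemo {
    // Hypothetical helper: flattens "(X,Y)" tuples into "X Y " and splits on a single space.
    static String[] preprocessAndSplit(String input) {
        String flattened = input.replaceAll("\\(([A-Z]+),\\s*([A-Z]+)\\)\\s*", "$1 $2 ");
        // flattened ends with a space; the default split drops the trailing empty string
        return flattened.split(" ");
    }

    public static void main(String[] args) {
        String input = "(B,C) (D,E) (A,B)";
        System.out.println(preprocessAndSplit(input).length); // 6

        // A negative limit keeps trailing empty strings, so one extra entry appears
        String flattened = input.replaceAll("\\(([A-Z]+),\\s*([A-Z]+)\\)\\s*", "$1 $2 ");
        System.out.println(flattened.split(" ", -1).length); // 7
    }
}
```

Note that only trailing empty strings are dropped; a leading one is kept when a delimiter matches at the very start of the string.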
CodePudding user response:
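As mentioned in the comments, since the string starts with ( and you are splitting on (, there will be an empty string at the start of the resulting array. The blank entries come from the space between ) and (, which your pattern does not treat as a delimiter.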
You can use this Java 8 stream based code to get rid of empty strings:
String str = "(B,C) (D,E) (A,B)";
String[] tokens = Arrays.stream(str.split("[(),\\s]+"))
.filter(s -> !s.isEmpty()).toArray(String[]::new);
//=> [ "B", "C", "D", "E", "A", "B" ]
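To see where the extra entries come from, it can help to print every token wrapped in brackets so that empty and blank strings become visible. This is only a small sketch for inspection; the class name SplitDemo and the helper show are invented for this example.

```java
public class SplitDemo {
    // Hypothetical helper: wraps each token in brackets so empty and blank tokens are visible.
    static String show(String[] tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            sb.append('[').append(token).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "(B,C) (D,E) (A,B)";

        // Pattern from the question: a leading empty string plus two single-space tokens
        System.out.println(show(str.split("\\(|\\)|,")));  // [][B][C][ ][D][E][ ][A][B]

        // Character class with whitespace and +: only the leading empty string remains
        System.out.println(show(str.split("[(),\\s]+")));  // [][B][C][D][E][A][B]
    }
}
```

The second call shows why the filter on isEmpty is still needed: the delimiter run at the start of the string still yields one empty leading token.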
CodePudding user response:
Try this.
public static void main(String[] args) {
String str = "(B,C) (D,E) (A,B)";
String[] tokens = str.replaceAll("[(),\\s]+", " ").trim().split(" ");
System.out.println(Arrays.toString(tokens));
}
output:
[B, C, D, E, A, B]