Blank split string in java String.split

Time:11-04

My goal is to extract the letters from a string such as

(B,C) (D,E) (A,B)

into the String array [B,C,D,E,A,B] using the split method.

The regex I used is "\\(|\\)|,"

and the result I get is [B,C, ,D,E, ,A,B].

I have no idea why there are blank elements in the result array.

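String[] tokens = str.split("\\s");    // ["(B,C)", "(D,E)", "(A,B)"]
for (String token : tokens) {
    char first = token.charAt(1);   // letter after "("
    char second = token.charAt(3);  // letter after ","
}

Splitting on whitespace and reading the characters by index, as above, could be another answer, but I want to understand what I am missing about the split method and the regex.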

CodePudding user response:

What you want here is a regex find all. One straightforward approach is to use a formal Pattern and Matcher:

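// Needs: import java.util.*; import java.util.regex.*;
String input = "(B,C) (D,E) (A,B)";
// Each tuple is "(", letters, ",", optional whitespace, letters, ")"
Pattern p = Pattern.compile("\\(([A-Z]+),\\s*([A-Z]+)\\)");
Matcher m = p.matcher(input);
List<String> matches = new ArrayList<>();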

while (m.find()) {
    matches.add(m.group(1));
    matches.add(m.group(2));
}

System.out.println(matches);  // [B, C, D, E, A, B]

Another approach, if you insist on using a string split, would be to first preprocess the string of tuples into a space-separated list:

String input = "(B,C) (D,E) (A,B)";
input = input.replaceAll("\\(([A-Z]+),\\s*([A-Z]+)\\)\\s*", "$1 $2 ");
String[] parts = input.split(" ");
System.out.println(Arrays.toString(parts));  // [B, C, D, E, A, B]

CodePudding user response:

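As mentioned in the comments, the input starts with ( and you are splitting on (, so the resulting array begins with an empty string. The single-space elements are the literal spaces between the tuples: your regex does not treat whitespace as a delimiter, so each space comes back as its own token.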
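To see this for yourself, here is a minimal sketch that runs the regex from the question and prints every token wrapped in brackets, so empty and whitespace-only tokens become visible. The class and method names (SplitDiagnostics, rawTokens) are made up for illustration only.

```java
import java.util.Arrays;

class SplitDiagnostics {
    // Hypothetical helper: applies the regex from the question unchanged.
    static String[] rawTokens(String s) {
        return s.split("\\(|\\)|,");
    }

    public static void main(String[] args) {
        String str = "(B,C) (D,E) (A,B)";
        String[] tokens = rawTokens(str);
        // Bracket each token so "" and " " can be told apart.
        for (String token : tokens) {
            System.out.println("[" + token + "]");
        }
        System.out.println(tokens.length + " tokens: " + Arrays.toString(tokens));
    }
}
```

The first token is empty because the string begins with a delimiter, and the two " " tokens are the spaces left between ) and (. The empty string after the final ) does not show up because split discards trailing empty strings by default.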

You can use this Java 8 stream based code to get rid of empty strings:

String str = "(B,C) (D,E) (A,B)";
String[] tokens = Arrays.stream(str.split("[(),\\s]+"))
    .filter(s -> !s.isEmpty())
    .toArray(String[]::new);

//=> [ "B", "C", "D", "E", "A", "B" ]


CodePudding user response:

Try this.

public static void main(String[] args) {
    String str = "(B,C) (D,E) (A,B)";
    String[] tokens = str.replaceAll("[(),\\s]+", " ").trim().split(" ");
    System.out.println(Arrays.toString(tokens));
}

output:

[B, C, D, E, A, B]