Preserving white space when using .split?

Time:12-11

I am trying to split a string and store this in an array, such that each word is stored at a separate index. I also want the white space and any punctuation to be stored.

eg. "Hello world!"

would be stored as:

array[0]: "Hello"
array[1]: " "
array[2]: "world!"

I am currently using .split but can't figure out how to split the string so that it splits at the end of the word and stores the white space.

strArray = str.split(" ");

CodePudding user response:

Add a space to every element except the last one:

String token = " ";
strArray = str.split(token);

// Re-append the delimiter to every element except the last
for (int i = 0; i < strArray.length - 1; i++) {
  strArray[i] = strArray[i] + token;
}
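
A minimal runnable sketch of this approach, assuming the input "Hello world!". The class and method names are made up for illustration. Note that the space ends up attached to the preceding word rather than stored in its own element, so this is only a partial match for what the question asks.

```java
import java.util.Arrays;

class ReappendSpace {
    // Hypothetical helper: split on a single space, then put the space
    // back on the end of every element except the last.
    static String[] splitKeepingTrailingSpace(String str) {
        String token = " ";
        String[] parts = str.split(token);
        for (int i = 0; i < parts.length - 1; i++) {
            parts[i] = parts[i] + token;
        }
        return parts;
    }

    public static void main(String[] args) {
        // prints [Hello , world!]
        System.out.println(Arrays.toString(splitKeepingTrailingSpace("Hello world!")));
    }
}
```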

CodePudding user response:

Look at StringTokenizer.

The constructor StringTokenizer(String str, String delim, boolean returnDelims) is handy for this: when returnDelims is true, each delimiter character is returned as its own token.

import java.util.*;
public class Test {
    public static void main (String[] args) {
        StringTokenizer st = new StringTokenizer(args[1], args[0], true);
        ArrayList<String> sa = new ArrayList<String>();
        while (st.hasMoreTokens())
            sa.add(st.nextToken());
        String[] array = sa.toArray(new String[sa.size()]);
        for (int i = 0; i < array.length; i++)
            System.out.println(String.format("array[%d] = '%s'", i, array[i]));
    }
}

$ java Test " " "a test"
array[0] = 'a'
array[1] = ' '
array[2] = 'test'

$ java Test " " "a  test"
array[0] = 'a'
array[1] = ' '
array[2] = ' '
array[3] = 'test'

You would then probably test each element of the array to determine whether it is a delimiter, which is not difficult because you already know the delimiter string that was passed to the constructor.
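
That delimiter check could look something like the sketch below. The class and the helper names tokenize and isDelimiter are hypothetical, and the check relies on StringTokenizer returning each delimiter as a one-character string when returnDelims is true.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterCheck {
    // Split str on any character in delims, keeping each delimiter as its own token.
    static List<String> tokenize(String str, String delims) {
        StringTokenizer st = new StringTokenizer(str, delims, true);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Assumption: a returned delimiter is a single character drawn from delims,
    // and word tokens never contain a delimiter character.
    static boolean isDelimiter(String token, String delims) {
        return token.length() == 1 && delims.contains(token);
    }

    public static void main(String[] args) {
        String delims = " ";
        for (String token : tokenize("Hello world!", delims)) {
            String kind = isDelimiter(token, delims) ? "delimiter" : "word";
            System.out.println(kind + ": '" + token + "'");
        }
    }
}
```

With the input "Hello world!" this labels 'Hello' and 'world!' as words and the single ' ' between them as a delimiter.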

CodePudding user response:

You can also always split the string yourself.

import java.util.*;

class Split {
    public static void main(String... args) {
        var s = new Split();
        System.out.println(s.split(args[0]));
    }

    private List<String> split(String word) {
        // An empty input has no tokens; this also keeps charAt(0) below from throwing
        if (word.isEmpty()) {
            return new ArrayList<>();
        }
        var list = new ArrayList<String>();
        var sb = new StringBuilder();
        var inSpace = word.charAt(0) == ' ';

        for (char c : word.toCharArray()) {

            if (c == ' ' && inSpace || c != ' ' && !inSpace) {
                sb.append(c);
            }
            if (c == ' ' && !inSpace || c != ' ' && inSpace) {
                list.add(sb.toString());
                sb = new StringBuilder();
                sb.append(c);
                inSpace = !inSpace;
            }
        }

        list.add(sb.toString());
        return list;
    }
}

Output:

~/$ java Split "hello world" 
[hello,  , world]
~/$ java Split "hello         world"
[hello,          , world]