Splitting a string into an array of strings of limited size

Time:07-08

I have a string containing an arbitrary address, for example:

String s = "H.N.-13/1443 laal street near bharath dental lab near thana qutubsher near modern bakery saharanpur uttar pradesh 247001";

I want to split it into an array of strings under two conditions:

  • each element of the array is at most 20 characters long
  • no element ends (or starts) in the middle of a word

For example, splitting every 20 characters would produce:

"H.N.-13/1443 laal st"
"reet near bharath de"
"ntal lab near thana"
"qutubsher near moder"
"n bakery saharanpur"

but the correct output would be:

"H.N.-13/1443 laal"
"street near bharath"
"dental lab near"
"thana qutubsher near"
"modern bakery"
"saharanpur"

Note that every element of the array has a length of at most 20 characters.

The first output above (the fixed 20-character split) is what my code currently produces:

static String[] split(String s,int max){
    int total_lines = s.length () / 24;
    if (s.length () % 24 != 0) {
        total_lines++;
    }

    String[] ans = new String[total_lines];
    int count = 0;
    int j = 0;

    for (int i = 0; i < total_lines; i++) {
        for (j = 0; j < 20; j++) {
            if (ans[count] == null) {
                ans[count] = "";
            }

            if (count > 0) {
                if ((20 * count) + j < s.length()) {
                    ans[count] += s.charAt (20 * count + j);
                } else {
                    break;
                }
            } else {
                ans[count] += s.charAt (j);
            }
        }

        String a = "";

        a += ans[count].charAt (0);

        if (a.equals (" ")) {
            ans[i] = ans[i].substring (0, 0) + "" + ans[i].substring (1);
        }

        System.out.println (ans[i]);

        count++;
    }
    return ans;
}

public static void main (String[]args) {
    String add = "H.N.-13/1663 laal street near bharath dental lab near thana qutubsher near modern bakery";
    String city = "saharanpur";
    String state = "uttar pradesh";
    String zip = "247001";
    String s = add + " " + city + " " + state + " " + zip;
    String[]ans = split (s, 20);
}

CodePudding user response:

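Find all runs of up to 20 characters that start with a non-space character and end at a word boundary, and collect them into a List. Matcher.results() requires Java 9 or later; the trim() drops the trailing space that a match picks up when the boundary falls just before the next word:

import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

List<String> parts = Pattern.compile("\\S.{1,19}\\b").matcher(s)
  .results()
  .map(m -> m.group().trim())
  .collect(Collectors.toList());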

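One caveat with \b: a word boundary also exists between a space and the letter that follows it, and inside tokens such as H.N.-13/1443, so a chunk can end in places you might not expect. As a rough sketch of an alternative, the pattern below requires each chunk to start and end on a non-space character and to be followed by whitespace or the end of the input. The names RegexWrap and wrap are made up for illustration, and the sketch assumes max is at least 2 and Java 9 or later.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class RegexWrap {
    // Hypothetical helper (not part of the answer above).
    // Each match: one non-space char, optionally up to (max - 2) arbitrary chars
    // plus a closing non-space char, followed by whitespace or end of input.
    static List<String> wrap(String s, int max) {
        if (max < 2) {
            throw new IllegalArgumentException("this sketch assumes max >= 2");
        }
        Pattern p = Pattern.compile("\\S(?:.{0," + (max - 2) + "}\\S)?(?=\\s|$)");
        return p.matcher(s)
                .results()                  // Matcher.results() exists since Java 9
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String s = "H.N.-13/1443 laal street near bharath dental lab near thana "
                 + "qutubsher near modern bakery saharanpur uttar pradesh 247001";
        wrap(s, 20).forEach(System.out::println);
    }
}
```

On the address from the question this yields seven chunks, from "H.N.-13/1443 laal" through "pradesh 247001", none with leading or trailing spaces. It does not handle a token longer than max: the matcher skips that token's leading characters instead of splitting it, so a loop-based approach is safer if such input is possible.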

CodePudding user response:

The code is hard to follow, but at first glance it builds each element character by character, which is why words get cut in the middle. Instead, go word by word: keep each word whole and carry it over to the next string when it does not fit. A more promising approach would be:

import java.util.ArrayList;
import java.util.List;

static String[] splitString(String s, int max) {
    String[] words = s.split("\\s+"); // split on runs of whitespace
    List<String> out = new ArrayList<>();
    int numWords = words.length;
    int i = 0;
    while (i < numWords) {
        int len = 0; // length of the current line, counting separating spaces
        StringBuilder sb = new StringBuilder();
        while (i < numWords && len < max) {
            int wordLength = words[i].length();
            len += (i == numWords - 1 ? wordLength : wordLength + 1); // 1 for space
            if (len <= max) {
                sb.append(words[i] + " ");
                i++;
            }
        }
        out.add(sb.toString().trim());
    }
    return out.toArray(new String[] {});
}

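Note: on your example input it produces valid output (every element is at most 20 characters and no word is split), although some line breaks differ from the ones you listed because the trailing space is counted against the limit. You will need to tweak it for other cases: for instance, a word longer than max characters is never consumed, so the loop does not terminate.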
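As a rough sketch of one such tweak (not the code above, and only one of several reasonable policies), the version below packs words greedily, counts a space only between words, and hard-splits any word longer than max so that the loop always makes progress. The names WordWrap and wrap are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class WordWrap {
    // Hypothetical helper: greedy word wrap with a hard split for over-long words.
    static List<String> wrap(String s, int max) {
        if (max < 1) {
            throw new IllegalArgumentException("max must be positive");
        }
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : s.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens when the input is blank
            }
            // Flush the current line if the word plus a separating space does not fit.
            if (line.length() > 0 && line.length() + 1 + word.length() > max) {
                lines.add(line.toString());
                line.setLength(0);
            }
            // Cut an over-long word into max-sized pieces; the remainder starts a new line.
            while (word.length() > max) {
                lines.add(word.substring(0, max));
                word = word.substring(max);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String s = "H.N.-13/1443 laal street near bharath dental lab near thana "
                 + "qutubsher near modern bakery saharanpur uttar pradesh 247001";
        wrap(s, 20).forEach(System.out::println);
    }
}
```

Because the separating space is only counted between words, "thana qutubsher near" (exactly 20 characters) stays on one line. Whether an over-long word should be cut, kept whole, or rejected depends on what the consumer of the array expects; cutting is just the assumption made here.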
