I have a string, for example:
String s = "This is a String which needs to be split after every n words";
Suppose I have to divide this string after every 5 words, so that the output is:
ArrayList stringArr = ["This is a String which", "needs to be split after", "every n words"]
How can I do this and store the result in an array in Java?
CodePudding user response:
While there isn't a built-in way for Java to do this, it's fairly easy to do using Java's standard regular expressions.
My example below tries to be clear, rather than trying to be the "best" way.
It's based on finding groups of five "words", each followed by a space, using the regular expression ([a-zA-Z]+ ){5}
which says
• [a-zA-Z]+
find any letters, repeated (+)
• (a space character)
followed by a space
• (...)
gather into groups
• {5}
exactly 5 times
You may want things besides letters, and you may want to allow multiple spaces or any whitespace, not just spaces, so later in the example I change the regex to (\\S+\\s+){5}
where \S
means any non-whitespace and \s
means any whitespace.
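To make the difference between the two patterns concrete, here is a small sketch that returns the first match each one finds. The class name RegexCompare, the helper firstMatch, and the sample sentence with punctuation and a double space are all made up for illustration; they are not part of the example that follows.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCompare {
    // Returns the first substring matched by the regex, or null if nothing matches.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Hypothetical input: contains a comma, an apostrophe and a double space.
        String text = "Well, it's  a String which needs to be split";

        // Letters followed by exactly one space, five times.
        System.out.println("[" + firstMatch("([a-zA-Z]+ ){5}", text) + "]");

        // Any non-whitespace run followed by any whitespace run, five times.
        System.out.println("[" + firstMatch("(\\S+\\s+){5}", text) + "]");
    }
}
```

With this input the letters-only pattern cannot start at "Well," or "it's", so its first match begins at "a", while the \S+\s+ version starts at the first word and absorbs the double space.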
The example first goes through the process in the main method, displaying output along the way that, I hope, makes it clear what's going on; it then shows how the process could be made into a method.
I create a method that splits a line into groups of n words, then call it to split your string every 5 words and again every 3 words.
Here it is:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineSplitterExample
{
    public static void main(String[] args)
    {
        String s = "This is a String which needs to be split after every n words";
        //Pattern p = Pattern.compile("([a-zA-Z]+ ){5}");
        Pattern p = Pattern.compile("(\\S+ ){5}");
        Matcher m = p.matcher(s);
        int last = 0;
        List<String> collected = new ArrayList<>();
        while (m.find()) {
            System.out.println("Group Count = " + m.groupCount());
            // groupCount() is 1 here, so this loop only visits group 0,
            // which is the whole match (five words, each with its trailing space)
            for (int i=0; i<m.groupCount(); i++) {
                final String found = m.group(i);
                System.out.printf("Group %d: %s%n", i, found);
                collected.add(found);
                // keep track of where the last group ended
                last = m.end();
                System.out.println("'m.end()' is " + last);
            }
        }
        // collect the final part of the string after the last group
        String tail = s.substring(last);
        System.out.println(tail);
        collected.add(tail);
        String[] result = collected.toArray(new String[0]);
        System.out.println("result:");
        for (int n=0; n<result.length; n++) {
            System.out.printf("%d: %s%n", n, result[n]);
        }
        // Put a little space after the output
        System.out.println("\n");
        // Now use the methods...
        String[] byFive = splitByWords(s, 5);
        displayArray(byFive);
        String[] byThree = splitByWords(s, 3);
        displayArray(byThree);
    }

    // Splits s into groups of n whitespace-separated words; the last group may be shorter
    private static String[] splitByWords(final String s, final int n)
    {
        //final Pattern p = Pattern.compile("([a-zA-Z]+ ){" + n + "}");
        final Pattern p = Pattern.compile("(\\S+\\s+){" + n + "}");
        final Matcher m = p.matcher(s);
        List<String> collected = new ArrayList<>();
        int last = 0;
        while (m.find()) {
            for (int i=0; i<m.groupCount(); i++) {
                collected.add(m.group(i));
                last = m.end();
            }
        }
        // whatever follows the last full group
        collected.add(s.substring(last));
        return collected.toArray(new String[0]);
    }

    private static void displayArray(final String[] array)
    {
        System.out.println("Array:");
        for (int i=0; i<array.length; i++) {
            System.out.printf("%d: %s%n", i, array[i]);
        }
    }
}
The output I got by running this is:
Group Count = 1
Group 0: This is a String which
'm.end()' is 23
Group Count = 1
Group 0: needs to be split after
'm.end()' is 47
every n words
result:
0: This is a String which
1: needs to be split after
2: every n words
Array:
0: This is a String which
1: needs to be split after
2: every n words
Array:
0: This is a
1: String which needs
2: to be split
3: after every n
4: words
CodePudding user response:
You can do it with a combination of replaceAll and split.
• S{N} - matches N iterations of S
• () - regular expression capture group
• $1 - back reference to the captured group
Replace every occurrence of N words with that occurrence followed by a special delimiter (in this case ###). Then split on that delimiter.
public static String[] splitNWords(String s, int count) {
    // N repetitions of "word characters followed by whitespace", captured as group 1
    String delim = "((?:\\w+\\s+){" + count + "})";
    return s.replaceAll(delim, "$1###").split("###");
}
Demo
String s = "This is a String which needs to be split after every n words";
for (int i = 1; i < 5; i++) {
    String[] arr = splitNWords(s, i);
    System.out.println("Splitting on " + i + " words.");
    for (String st : arr) {
        System.out.println(st);
    }
    System.out.println();
}
prints
Splitting on 1 words.
This
is
a
String
which
needs
to
be
split
after
every
n
words
Splitting on 2 words.
This is
a String
which needs
to be
split after
every n
words
Splitting on 3 words.
This is a
String which needs
to be split
after every n
words
Splitting on 4 words.
This is a String
which needs to be
split after every n
words
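Two details of this approach are easy to miss: each piece keeps the whitespace that followed its last word, and the input must not already contain the delimiter. The following is a minimal sketch, assuming ### never occurs in the text. The class name SplitNWordsTrimmed and the helper mark are illustrative only; the sketch exposes the intermediate string and trims each piece.

```java
class SplitNWordsTrimmed {
    // Inserts ### after every run of `count` words (assumes ### is not in the input).
    static String mark(String s, int count) {
        return s.replaceAll("((?:\\w+\\s+){" + count + "})", "$1###");
    }

    // Splits on the inserted delimiter and strips the trailing whitespace of each piece.
    static String[] splitNWordsTrimmed(String s, int count) {
        String[] parts = mark(s, count).split("###");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        String s = "This is a String which needs to be split after every n words";
        // Intermediate string, with the delimiter visible
        System.out.println(mark(s, 5));
        // Final pieces, bracketed so any stray whitespace would show
        for (String part : splitNWordsTrimmed(s, 5)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

For the sample string and a count of 5, the intermediate string is "This is a String which ###needs to be split after ###every n words", which is why the untrimmed pieces end in a space.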
CodePudding user response:
I don't think there is a built-in way to split every n words. You need to specify a pattern, such as a blank space. For instance, you can split on every blank, then iterate over the resulting array and build another one with the number of words you want.
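As a rough sketch of that idea (the class name SplitAndRegroup and the method groupWords are made up here, and splitting on any run of whitespace rather than a single blank is an assumption), it could look like this:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitAndRegroup {
    // Splits s on whitespace, then joins every n consecutive words with a single space.
    static List<String> groupWords(String s, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> groups = new ArrayList<>();
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return groups; // no words at all
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start += n) {
            // The last group may hold fewer than n words
            int end = Math.min(start + n, words.length);
            groups.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        String s = "This is a String which needs to be split after every n words";
        for (String group : groupWords(s, 5)) {
            System.out.println(group);
        }
        // If an array is needed rather than a List:
        String[] asArray = groupWords(s, 5).toArray(new String[0]);
        System.out.println(asArray.length);
    }
}
```

Because the words are re-joined with a single space, any original spacing between words is normalised; that may or may not be what you want.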
Regards