Splitting string in java with multiple cases

Time:01-20

I need to extract values from a string and insert them into a list. Here are some sample inputs:

String k1 = "Apple";
String k2 = "Apple Orange";
String k3 = "Apple (Banana, Orange, Grape)";
String k4 = "Apple Orange (Banana, Grape)";
String k5 = "Apple Orange (Banana Ice cream, Grape)";

The string may contain a single word, several words, or several words followed by a group in brackets. In each case I need to extract the items and store them in a list.

The string has two parts. The part in brackets is a comma-separated list, such as (Banana, Orange, Grape) or (Banana Ice cream, Orange, Grape cream); the items are always separated by commas. Whatever comes before the bracket is treated as a single item, even if it contains several words. Example 1: Apple Orange (Banana Ice cream, Grape) gives [Apple Orange, Banana Ice cream, Grape]. Example 2: Apple Orange gives [Apple Orange].

Expected output for the samples above:

k1 -> [Apple]
k2 -> [Apple Orange]
k3 -> [Apple, Banana, Orange, Grape]
k4 -> [Apple Orange, Banana, Grape]
k5 -> [Apple Orange, Banana Ice cream, Grape]

Is there a way to extract the items like this?

CodePudding user response:

I would use a regular expression. For brevity, I assume that you use Java 9 or newer and that your strings contain only ASCII letters (no umlauts, etc.).

import java.util.List;

String k1 = "Apple";
String k2 = "Apple Orange";
String k3 = "Apple (Banana, Orange, Grape)";
String k4 = "Apple Orange (Banana, Grape)";

// Any run of one or more non-letter characters acts as a single delimiter.
final String delimiter = "[^a-zA-Z]+";

// expected output: [Apple]
System.out.println(List.of(k1.split(delimiter)));

// expected output: [Apple, Orange]
System.out.println(List.of(k2.split(delimiter)));

// expected output: [Apple, Banana, Orange, Grape]
System.out.println(List.of(k3.split(delimiter)));

// expected output: [Apple, Orange, Banana, Grape]
System.out.println(List.of(k4.split(delimiter)));

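If your words contain letters other than a-z and A-Z, use a Unicode-aware delimiter such as [^\\p{L}]+. @WJS suggested \\p{Alpha}; note that in Java this class only covers ASCII letters unless the (?U) flag (UNICODE_CHARACTER_CLASS) is set. (I used the simpler form because, in my opinion, it is easier to see what it does.)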
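
For words with non-ASCII letters, here is a minimal sketch of the same split using the Unicode letter class \p{L}. The class name UnicodeSplitDemo and the method words are made up for illustration, and the sample text is written with Unicode escapes only so that the source compiles under any file encoding.

```java
import java.util.Arrays;
import java.util.List;

class UnicodeSplitDemo {
    // Hypothetical helper: splits on every run of characters that are not Unicode letters.
    static List<String> words(String s) {
        return Arrays.asList(s.split("[^\\p{L}]+"));
    }

    public static void main(String[] args) {
        // The escapes spell two German words with umlauts, plus "Birne" inside the brackets.
        String sample = "\u00C4pfel (Birne, M\u00FCsli)";
        // The list has three items; the umlauts stay inside their words.
        System.out.println(words(sample));
    }
}
```

Like the ASCII version, this still puts every word in its own element.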

CodePudding user response:

So you want to extract all the words in a string and assign them to a list.

First of all, you need to take words as they are: no punctuation.

Here is the method:

import java.util.ArrayList;
import java.util.List;

public static List<String> listOfWords(String str) {

    // You need words, so every run of characters
    // that are not letters is used as a delimiter.
    String[] result = str.split("[^a-zA-Z]+");

    List<String> list = new ArrayList<>();

    // Copy each word into the list.
    for (int i = 0; i < result.length; i++) {
        list.add(result[i]);
    }

    return list;
}
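
Note that this puts every word in its own element, so k2 becomes [Apple, Orange]. If the text before the bracket has to stay together as one item, as the expected output in the question shows, a minimal sketch could look like the following. The class name FruitSplitter and the method extract are hypothetical, and the sketch assumes at most one bracketed group per string.

```java
import java.util.ArrayList;
import java.util.List;

class FruitSplitter {
    // Hypothetical helper: the text before "(" becomes one item,
    // and the text inside the brackets is split on commas.
    static List<String> extract(String str) {
        List<String> items = new ArrayList<>();
        int open = str.indexOf('(');

        // Everything before the bracket (or the whole string) is a single item.
        String head = (open < 0 ? str : str.substring(0, open)).trim();
        if (!head.isEmpty()) {
            items.add(head);
        }

        if (open >= 0) {
            // Tolerate a missing closing bracket by reading to the end of the string.
            int close = str.indexOf(')', open);
            String inner = str.substring(open + 1, close < 0 ? str.length() : close);
            for (String part : inner.split(",")) {
                String item = part.trim();
                if (!item.isEmpty()) {
                    items.add(item);
                }
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // prints [Apple Orange, Banana Ice cream, Grape]
        System.out.println(extract("Apple Orange (Banana Ice cream, Grape)"));
    }
}
```

Splitting on the bracket first and on commas second keeps multi-word items such as "Banana Ice cream" intact, which a pure non-letter delimiter cannot do.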

Bye!
