Parse from end to start of a String to grab data before the third occurrence of a delimiter

Time:09-30

I am working with some strings and want to retrieve the part of each string that lies before the third occurrence of "-", counting from the end of the string. The data arrives as a String from the DB, and some values contain the text "-NONE----", which I would like to exclude while parsing.

Input (each input below is a String, not a List)

String input1 = "-A123456-B987-013691-000-109264821";
String input2 = "-NONE----";
String input3 = "C1234567-A1241-EF-012361-000-18273460";

Output

String output1 = "-A123456-B987";
String output2 = "-NONE----";
String output3 = "C1234567-A1241-EF";

Reading from the beginning of the string, I need everything before the third "-" (hyphen), where the hyphens are counted starting from the end of the string.

Any tips are appreciated.

CodePudding user response:

You could use a regex replacement approach:

String input = "-A123456-B987-013691-000-109264821";
String output = input.replaceAll("([^-]*(?:-[^-]+){2}).*", "$1");
System.out.println(output);  // -A123456-B987

The regex pattern used here says to match:

  • ( open capture group
    • [^-]* match an optional first term (zero or more non-hyphen characters)
    • (?:-[^-]+){2} then match - and a term, twice
  • ) close capture group, available as $1
  • .* consume the remainder of the string
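
To see how this pattern behaves on all three inputs from the question, here is a minimal sketch. The class and method names are made up for illustration. Note that the pattern keeps the first three terms counted from the start of the string, which happens to coincide with the desired result for the sample inputs because each has exactly three hyphens after the wanted part.

```java
class ReplaceFromStartDemo {

    // Keeps the optional first term plus the next two "-term" pairs,
    // counting from the START of the string, and drops the rest.
    // If the pattern does not match, replaceAll returns the input unchanged.
    static String keepFirstThreeTerms(String input) {
        return input.replaceAll("([^-]*(?:-[^-]+){2}).*", "$1");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "-A123456-B987-013691-000-109264821",
            "-NONE----",
            "C1234567-A1241-EF-012361-000-18273460"
        };
        for (String input : inputs) {
            System.out.println(keepFirstThreeTerms(input));
        }
    }
}
```

"-NONE----" comes back unchanged because it never contains two consecutive "-term" pairs, so nothing is replaced. A string with a different number of segments, such as "A-B-C-D-E-F-G", would give "A-B-C" here rather than "A-B-C-D", so this only fits if the number of trailing segments is fixed.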

CodePudding user response:

You could match the last three dashes by anchoring the pattern to the end of the string with the $ symbol, and then extract everything in front of them. I created two capture groups, where the first one is what you want to extract:

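import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static String extractFront(String input1) {
    // Group 1 is everything in front of the last three "-segment" parts.
    Pattern pattern = Pattern.compile("(.*)(-[^-]*){3}$");
    Matcher matcher = pattern.matcher(input1);
    if (matcher.find()) {
        return matcher.group(1);
    }

    // Fewer than three hyphens in the input: nothing to extract.
    return null;
}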

Main to test:

public static void main(String[] args) {
    String input1 = "-A123456-B987-013691-000-109264821";
    String input2 = "-NONE----";
    String input3 = "C1234567-A1241-EF-012361-000-18273460";

    System.out.println(extractFront(input1));
    System.out.println(extractFront(input2));
    System.out.println(extractFront(input3));
}

Output:

-A123456-B987
-NONE-
C1234567-A1241-EF
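
If you would rather avoid regex, the same "count three hyphens from the end" idea can be sketched with String.lastIndexOf. This is only a sketch under two assumptions that are mine, not the asker's: the placeholder is exactly "-NONE----" and should be returned untouched, and a string with fewer than three hyphens is also returned as-is. The class, method, and constant names are hypothetical.

```java
class LastIndexOfDemo {

    // Assumption: the placeholder value is exactly this string.
    static final String PLACEHOLDER = "-NONE----";

    static String beforeThirdHyphenFromEnd(String input) {
        if (PLACEHOLDER.equals(input)) {
            return input; // leave the placeholder untouched
        }
        int cut = input.length();
        // Walk backwards, jumping to the previous hyphen three times.
        for (int i = 0; i < 3; i++) {
            cut = input.lastIndexOf('-', cut - 1);
            if (cut < 0) {
                return input; // fewer than three hyphens: return as-is
            }
        }
        // cut is now the index of the third hyphen from the end.
        return input.substring(0, cut);
    }

    public static void main(String[] args) {
        System.out.println(beforeThirdHyphenFromEnd("-A123456-B987-013691-000-109264821"));
        System.out.println(beforeThirdHyphenFromEnd("-NONE----"));
        System.out.println(beforeThirdHyphenFromEnd("C1234567-A1241-EF-012361-000-18273460"));
    }
}
```

With the placeholder guard in place, the second input prints "-NONE----" instead of "-NONE-", which matches the output requested in the question.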

CodePudding user response:

We can use streams, lambdas, and a predicate.

Split your input on its end-of-line character, to get an array of strings. We filter out the “NONE” lines.

For each line, we split into pieces, using the hyphen as a delimiter. This gives us an array of strings that we reassemble using only the first three parts.

Lastly we collect into a list.

Here is some untested code to get you started.

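// Needs java.util.Arrays and java.util.List; Stream#toList requires Java 16 or later.
String[] lines = input.split( "\n" ) ;
List < String > results = 
    Arrays
    .stream( lines ) 
    .filter( line -> ! line.contains( "-NONE-" ) )  // Skip the placeholder lines.
    .map(
        line -> 
            String.join( 
                "-" ,
                // Keep only the first three hyphen-delimited parts.
                Arrays.copyOf( line.split( "-" , 4 ) , 3 , String[].class )
            )
    )
    .toList() 
 ;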
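
The code above keeps the first three parts counted from the start of each line. If the cut really has to be counted from the end, one possible variation is to split on every hyphen and drop the last three parts instead. The sketch below assumes the input is newline-separated, and the class and method names are hypothetical; it uses Collectors.toList() so that it also runs on Java versions before 16.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SplitFromEndDemo {

    // Splits the input into lines, skips the "NONE" lines, trims the rest.
    static List<String> extractAll(String input) {
        return Arrays.stream(input.split("\n"))
            .filter(line -> !line.contains("-NONE-"))
            .map(SplitFromEndDemo::dropLastThreeParts)
            .collect(Collectors.toList());
    }

    // Removes the last three hyphen-delimited parts of a line.
    static String dropLastThreeParts(String line) {
        // A limit of -1 keeps trailing empty parts, so every hyphen counts.
        String[] parts = line.split("-", -1);
        if (parts.length <= 3) {
            return line; // fewer than three hyphens: return as-is
        }
        return String.join("-", Arrays.copyOf(parts, parts.length - 3));
    }

    public static void main(String[] args) {
        String input = "-A123456-B987-013691-000-109264821\n"
            + "-NONE----\n"
            + "C1234567-A1241-EF-012361-000-18273460";
        extractAll(input).forEach(System.out::println);
    }
}
```

Like the original snippet, this filters the "NONE" lines out of the result rather than passing them through unchanged.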