How to grab string between square brackets and exclude if it is only a number

Time:09-23

I am trying to create a regex that captures the string between square brackets, but excludes it when the content is only a number (for example, 1234).

I am using the regex

\[(.*?)\]

Suppose the sample data is

requests[45180], indices[movies]

In this case, I get the output:

[45180]
[movies]

But my expected output is :

movies

Code:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatcher {
    private static String REGEX = "\\[(.*?)\\]";
    private static String NUMBERS_REGEX = "\\d+";
    private static List sampleData = Arrays.asList("test from [a.b.v1.2.0.71-0] to [a.b.v1.2.0.73-0]",
            "requests[45180], indices[movies]");

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(REGEX);
        Pattern numberPattern = Pattern.compile(NUMBERS_REGEX);
        for (Object data : sampleData) {
            List<String> indices = new ArrayList<>();
            Matcher matcher = pattern.matcher(data.toString());

            while (matcher.find()) {
                String index = matcher.group().replaceAll("[\\[\\]']+", "");
                Matcher numberMatcher = numberPattern.matcher(index);
                if (!numberMatcher.matches())
                    indices.add(index);
                
            }
            if (indices.size() > 0)
                System.out.println("Indices: " + indices);
        }
    }
}

Can anyone please help me resolve this issue?

CodePudding user response:

If your expected output is "movies", and you want to skip brackets that contain digits as well as empty brackets, you can use a capture group:

\[([^\]\[\d]+)]
  • \[ Match [
  • ( Capture group 1
    • [^\]\[\d]+ Match 1 or more chars other than [, ] or a digit
  • ) Close group 1
  • ] Match ]


In Java:

String regex = "\\[([^\\]\\[\\d]+)]";

Example

String regex = "\\[([^\\]\\[\\d]+)]";
String string = "requests[45180], indices[movies]";

Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);

while (matcher.find()) {
    System.out.println(matcher.group(1));
}

Output

movies
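
Note that this pattern rejects any bracket that contains a digit anywhere, not only brackets that are purely numeric. A minimal sketch of that behaviour on both sample strings from the question (the class name `NoDigitBrackets` and the helper `extract` are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NoDigitBrackets {
    // Group 1: one or more chars that are not '[', ']' or a digit.
    static final Pattern NO_DIGITS = Pattern.compile("\\[([^\\]\\[\\d]+)]");

    // Hypothetical helper: collects group 1 of every match in the input.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = NO_DIGITS.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [movies]
        System.out.println(extract("requests[45180], indices[movies]"));
        // Prints [] because both version strings contain digits
        System.out.println(extract("test from [a.b.v1.2.0.71-0] to [a.b.v1.2.0.73-0]"));
    }
}
```

If values such as a.b.v1.2.0.71-0 should be kept, a pattern that only rejects all-digit content is a better fit.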

CodePudding user response:

You can use either of

\[([^\]\[]*[^\d\[\]][^\]\[]*)]
\[(?!\d+])([^\]\[]*)]


In Java, you need to double the backslashes in the string literals (use one of the two declarations):

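// Option 1: require at least one non-digit char inside the brackets
private static String REGEX = "\\[([^\\]\\[]*[^\\d\\[\\]][^\\]\\[]*)]";
// Option 2: negative lookahead that rejects digits-only content
private static String REGEX = "\\[(?!\\d+])([^\\]\\[]*)]";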

Details:

  • \[ - a [ char
  • [^\]\[]* - zero or more chars other than [ and ]
  • [^\d\[\]] - a char other than [, ] and a digit
  • [^\]\[]* - zero or more chars other than [ and ]
  • ] - a ] char.

And

  • \[ - a [ char
  • (?!\d+]) - immediately to the right, there cannot be one or more digits followed by a ] char
  • [^\]\[]* - zero or more chars other than [ and ]
  • ] - a ] char.
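
The two patterns agree on the question's data but differ on empty brackets. A small sketch that runs both on one string; the input (including `tag[v1.2]` and `empty[]`), the class name `MixedBrackets` and the helper `extract` are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MixedBrackets {
    // Pattern 1: the content must contain at least one non-digit char.
    static final Pattern AT_LEAST_ONE_NON_DIGIT =
            Pattern.compile("\\[([^\\]\\[]*[^\\d\\[\\]][^\\]\\[]*)]");
    // Pattern 2: the content must not be digits only (lookahead check).
    static final Pattern NOT_ONLY_DIGITS =
            Pattern.compile("\\[(?!\\d+])([^\\]\\[]*)]");

    // Hypothetical helper: collects group 1 of every match in the input.
    static List<String> extract(Pattern pattern, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "requests[45180], indices[movies], tag[v1.2], empty[]";
        // Prints [movies, v1.2]
        System.out.println(extract(AT_LEAST_ONE_NON_DIGIT, input));
        // Prints [movies, v1.2, ] because the lookahead version also matches []
        System.out.println(extract(NOT_ONLY_DIGITS, input));
    }
}
```

Both variants keep mixed content such as v1.2. Only the first one skips empty brackets, so prefer it if empty strings should never be collected.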

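In your Java code, you can probably reduce the matching loop to the following, since the regex itself now filters out the numbers and group 1 already excludes the brackets: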

while (matcher.find()) {
    String index = matcher.group(1);
    indices.add(index);
}
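
For completeness, here is a sketch of how the whole program from the question might look with the lookahead pattern and the shorter loop. The class name `RegexMatcherSimplified` and the helper `indicesOf` are made up for illustration; the sample data is the one from the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatcherSimplified {
    // Skips brackets whose content is digits only; group 1 is the content.
    static final Pattern PATTERN = Pattern.compile("\\[(?!\\d+])([^\\]\\[]*)]");

    static final List<String> SAMPLE_DATA = Arrays.asList(
            "test from [a.b.v1.2.0.71-0] to [a.b.v1.2.0.73-0]",
            "requests[45180], indices[movies]");

    // Hypothetical helper: returns the bracketed values found in one line.
    static List<String> indicesOf(String data) {
        List<String> indices = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(data);
        while (matcher.find()) {
            indices.add(matcher.group(1));
        }
        return indices;
    }

    public static void main(String[] args) {
        for (String data : SAMPLE_DATA) {
            List<String> indices = indicesOf(data);
            if (!indices.isEmpty()) {
                System.out.println("Indices: " + indices);
            }
        }
        // Prints:
        // Indices: [a.b.v1.2.0.71-0, a.b.v1.2.0.73-0]
        // Indices: [movies]
    }
}
```

The second regex pattern and the `replaceAll` call from the question are no longer needed in this version.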

