I am trying to create a regex that captures the string between square brackets, but excludes it when the bracketed value is only a number, such as 1234.
I am using the regex
\[(.*?)\]
Suppose the sample data is
requests[45180], indices[movies]
In this case, I get the output:
[45180]
[movies]
But my expected output is:
movies
Code:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatcher {
    private static String REGEX = "\\[(.*?)\\]";
    private static String NUMBERS_REGEX = "\\d+";
    private static List<String> sampleData = Arrays.asList(
            "test from [a.b.v1.2.0.71-0] to [a.b.v1.2.0.73-0]",
            "requests[45180], indices[movies]");

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(REGEX);
        Pattern numberPattern = Pattern.compile(NUMBERS_REGEX);
        for (String data : sampleData) {
            List<String> indices = new ArrayList<>();
            Matcher matcher = pattern.matcher(data);
            while (matcher.find()) {
                // Strip the surrounding brackets from the full match
                String index = matcher.group().replaceAll("[\\[\\]']+", "");
                Matcher numberMatcher = numberPattern.matcher(index);
                // Keep only values that are not purely numeric
                if (!numberMatcher.matches())
                    indices.add(index);
            }
            if (indices.size() > 0)
                System.out.println("Indices: " + indices);
        }
    }
}
Can anyone please help me resolve this issue?
CodePudding user response:
If your expected output is "movies", and you want to match neither digits between the square brackets nor empty strings, you can use a capture group:
\[([^\]\[\d]+)]
\[ - Match [
( - Capture group 1
[^\]\[\d]+ - Match 1+ chars other than [, ] or a digit
) - Close group 1
] - Match ]
In Java:
String regex = "\\[([^\\]\\[\\d]+)]";
Example
String regex = "\\[([^\\]\\[\\d]+)]";
String string = "requests[45180], indices[movies]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output
movies
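Note that because `\d` is excluded from the character class, this pattern skips any bracketed value that contains a digit anywhere, not only values made up entirely of digits. The sketch below illustrates this with the two sample strings from the question; the class name `NoDigitBrackets` and the method `extract` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class NoDigitBrackets {
    // One or more chars that are not '[', ']' or a digit, wrapped in brackets.
    private static final Pattern PATTERN = Pattern.compile("\\[([^\\]\\[\\d]+)]");

    // Returns the content of capture group 1 for every match.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The version strings contain digits, so nothing is captured.
        System.out.println(extract("test from [a.b.v1.2.0.71-0] to [a.b.v1.2.0.73-0]")); // []
        // Only the digit-free value is captured.
        System.out.println(extract("requests[45180], indices[movies]")); // [movies]
    }
}
```

If values such as `a.b.v1.2.0.71-0` should be kept, a pattern that only rejects all-digit content is a better fit.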
CodePudding user response:
You can use either of
\[([^\]\[]*[^\d\[\]][^\]\[]*)]
\[(?!\d+])([^\]\[]*)]
See the regex demo #1 and regex demo #2.
In Java, you will need to double the backslashes in the string literals (use one of the two declarations):
private static String REGEX = "\\[([^\\]\\[]*[^\\d\\[\\]][^\\]\\[]*)]";
// or
private static String REGEX = "\\[(?!\\d+])([^\\]\\[]*)]";
Details:
\[ - a [ char
[^\]\[]* - zero or more chars other than [ and ]
[^\d\[\]] - a char other than [, ] and a digit
[^\]\[]* - zero or more chars other than [ and ]
] - a ] char.
And
\[ - a [ char
(?!\d+]) - immediately to the right, there cannot be one or more digits followed with a ] char
[^\]\[]* - zero or more chars other than [ and ]
] - a ] char.
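To see where the two patterns agree and where they differ, here is a minimal sketch that runs both against the sample strings from the question plus an empty pair of brackets. The class name `BracketPatternDemo`, the constant names, and the helper `extract` are hypothetical and exist only for this illustration; both patterns are written with capture group 1 around the bracket content.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question or answers.
class BracketPatternDemo {
    // Requires at least one non-digit, non-bracket char between the brackets.
    static final String AT_LEAST_ONE_NON_DIGIT = "\\[([^\\]\\[]*[^\\d\\[\\]][^\\]\\[]*)]";
    // Rejects brackets whose whole content is one or more digits.
    static final String NOT_ONLY_DIGITS = "\\[(?!\\d+])([^\\]\\[]*)]";

    // Collects the content of capture group 1 for every match in the input.
    static List<String> extract(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] samples = {
            "test from [a.b.v1.2.0.71-0] to [a.b.v1.2.0.73-0]",
            "requests[45180], indices[movies]",
            "empty[] brackets"
        };
        for (String sample : samples) {
            System.out.println(sample);
            System.out.println("  pattern 1: " + extract(AT_LEAST_ONE_NON_DIGIT, sample).size() + " match(es)");
            System.out.println("  pattern 2: " + extract(NOT_ONLY_DIGITS, sample).size() + " match(es)");
        }
    }
}
```

On these inputs the only difference is `[]`: the first pattern needs at least one character and skips it, while the lookahead version matches it and yields an empty string in group 1.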
In your Java code, you can probably reduce the matching loop to the following, provided the pattern has a capture group around the bracket content:
while (matcher.find()) {
    String index = matcher.group(1);
    indices.add(index);
}
See the Java demo.
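For completeness, the class from the question could be reworked along these lines. This is only a sketch: it assumes the first of the two patterns (with a capture group around the bracket content), and the class name `RegexMatcherFixed` and the helper `extractIndices` are hypothetical names, not taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical rework of the question's RegexMatcher class.
class RegexMatcherFixed {
    // Bracket content must contain at least one char that is not a digit.
    private static final Pattern PATTERN =
            Pattern.compile("\\[([^\\]\\[]*[^\\d\\[\\]][^\\]\\[]*)]");

    private static final List<String> SAMPLE_DATA = Arrays.asList(
            "test from [a.b.v1.2.0.71-0] to [a.b.v1.2.0.73-0]",
            "requests[45180], indices[movies]");

    // Group 1 already excludes the brackets, so no replaceAll or
    // second number-only pattern is needed.
    static List<String> extractIndices(String data) {
        List<String> indices = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(data);
        while (matcher.find()) {
            indices.add(matcher.group(1));
        }
        return indices;
    }

    public static void main(String[] args) {
        for (String data : SAMPLE_DATA) {
            List<String> indices = extractIndices(data);
            if (!indices.isEmpty()) {
                System.out.println("Indices: " + indices);
            }
        }
        // Prints:
        // Indices: [a.b.v1.2.0.71-0, a.b.v1.2.0.73-0]
        // Indices: [movies]
    }
}
```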