I have a string that contains one or more comma-separated values, each surrounded by quotes, with the whole list enclosed in parentheses. So it can be of the form os IN ('WIN', 'MAC', 'LNU') (for multiple values) or just os IN ('WIN') for a single value.
I need to extract the values into a List.
I have tried this regex, but it captures all the values into a single list element, as one whole String 'WIN', 'MAC', instead of two String values, WIN and MAC:
List<String> matchList = new ArrayList<>();
Pattern regex = Pattern.compile("\\((.+?)\\)");
Matcher regexMatcher = regex.matcher(processedFilterString);
while (regexMatcher.find()) { // find each parenthesized group in the string
    matchList.add(regexMatcher.group(1)); // add the text between the parentheses
}
Result:
Input: os IN ('WIN', 'MAC')
Output:
['WIN', 'MAC']
length: 1
In its current form, the regex matches one or more characters surrounded by parentheses and captures them in a single group, which is probably why the result is just one string. How can I adapt it to capture each of the values separately?
Edit: Just adding some more details. The input string can have multiple IN clauses containing other criteria, such as id IN ('xxxxxx') AND os IN ('WIN', 'MAC'). Also, the matched values do not necessarily all have the same length, so the input could be os IN ('WIN', 'MAC', 'LNUX').
CodePudding user response:
You may try splitting the CSV string from the IN clause:
List<String> matchList = null;
Pattern regex = Pattern.compile("\\((.+?)\\)");
Matcher regexMatcher = regex.matcher(processedFilterString);
if (regexMatcher.find()) {
    // strip the leading and trailing quote, then split on the quote-comma-quote separators
    String match = regexMatcher.group(1).replaceAll("^'|'$", "");
    String[] terms = match.split("'\\s*,\\s*'");
    matchList = Arrays.stream(terms).collect(Collectors.toList());
}
Note that if your input string could contain multiple IN clauses, then the above would need to be modified to use a while loop.
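As a rough sketch of that while-loop variant: the class name InClauseSplitter and the method extractValues below are made up for illustration, and the code assumes that values never contain parentheses or single quotes. It collects the values of every IN clause into one flat list, so it does not record which column each value belongs to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InClauseSplitter {
    // Captures the text between each pair of parentheses, e.g. 'WIN', 'MAC'
    private static final Pattern GROUP = Pattern.compile("\\((.+?)\\)");

    // Hypothetical helper: returns the values of all IN clauses as one flat list.
    static List<String> extractValues(String filter) {
        List<String> values = new ArrayList<>();
        Matcher m = GROUP.matcher(filter);
        while (m.find()) {
            // Drop the outermost quotes, then split on the quote-comma-quote separators
            String inner = m.group(1).trim().replaceAll("^'|'$", "");
            for (String term : inner.split("'\\s*,\\s*'")) {
                values.add(term);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(extractValues("id IN ('xxxxxx') AND os IN ('WIN', 'MAC')"));
        // prints [xxxxxx, WIN, MAC]
    }
}
```

If the values need to stay grouped per clause, the loop body could add each split result to its own list instead of the shared one.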
CodePudding user response:
From the examples in your question, it appears that your regular expression needs to find strings of at least three upper-case letters enclosed in single quotes.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Solution {
    public static void main(String[] args) {
        String s = "os IN ('WIN', 'MAC', 'LNUX')";
        Pattern pattern = Pattern.compile("'([A-Z]{3,})'");
        Matcher matcher = pattern.matcher(s);
        List<String> list = new ArrayList<>();
        while (matcher.find()) {
            list.add(matcher.group(1));
        }
        System.out.println(list);
    }
}
Running the above code produces the following output:
[WIN, MAC, LNUX]
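The pattern above would skip a value such as 'xxxxxx' from the edited question, since it is not upper-case. A possible variation, offered only as a sketch, is to match anything between a pair of single quotes. The class name QuotedValues and the method extract are made up for illustration, and the code assumes that values never contain a single quote (escaped or otherwise) and that every quoted string in the input is a value you want.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedValues {
    // Captures any run of non-quote characters between two single quotes
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    // Hypothetical helper: returns every single-quoted value found in the input.
    static List<String> extract(String s) {
        List<String> list = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(s);
        while (matcher.find()) {
            list.add(matcher.group(1));
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(extract("id IN ('xxxxxx') AND os IN ('WIN', 'MAC', 'LNUX')"));
        // prints [xxxxxx, WIN, MAC, LNUX]
    }
}
```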