I feel like this is trivial but can't find any solution that works for me.
I have a string like this:
cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr
I need to find the value between cn= and ,ou=tged,ou=groupes,o=choregie,c=fr. In this case I should match only doc_medical first and doc_confidentiel second.
I have this regex: (?=cn=)(.*?)(?<=,ou=tged,ou=groupes,o=choregie,c=fr)
The problem is that the second match starts at the second cn= of the whole string and runs until the next ,ou=tged,ou=groupes,o=choregie,c=fr. So my second match is wrong, because it contains cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr instead of only doc_confidentiel.
I don't know how many characters there can be between the two strings, and I can't figure out how to force the regex to match the closest cn= preceding the ,ou=tged,ou=groupes,o=choregie,c=fr string instead of the first one it encounters.
CodePudding user response:
You can use
(?<=cn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr)
See the regex demo.
Details:
(?<=cn=) - a location immediately preceded with cn=
[^,|]+ - one or more chars other than | and ,
(?=,ou=tged,ou=groupes,o=choregie,c=fr) - a positive lookahead that requires a ,ou=tged,ou=groupes,o=choregie,c=fr string to appear immediately to the right of the current location.
See the Java demo:
import java.util.*;
import java.util.regex.*;

class Test
{
    public static void main (String[] args) throws java.lang.Exception
    {
        // Lookbehind for "cn=", then one or more chars other than ',' and '|',
        // then a lookahead for the fixed suffix.
        String regex = "(?<=cn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr)";
        String string = "cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}
Output:
doc_medical
doc_confidentiel
NOTE: If there can be a key other than cn that ends in cn (that is, with more chars on the left), use a word boundary: (?<=\bcn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr)
In Java: String regex = "(?<=\\bcn=)[^,|]+(?=,ou=tged,ou=groupes,o=choregie,c=fr)";
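To illustrate the word boundary note, here is a minimal sketch. The key dcn and the helper findAll are made up for the demonstration; they are not part of the original data or of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    static final String SUFFIX = ",ou=tged,ou=groupes,o=choregie,c=fr";

    // Hypothetical helper: returns every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed input: "dcn" is an invented key that happens to end in "cn".
        String input = "dcn=other" + SUFFIX + "|cn=doc_medical" + SUFFIX;
        String tail = "[^,|]+(?=" + SUFFIX + ")";
        // Without \b the lookbehind is also satisfied right after "dcn=".
        System.out.println(findAll("(?<=cn=)" + tail, input));    // [other, doc_medical]
        // With \b only a standalone "cn" key qualifies.
        System.out.println(findAll("(?<=\\bcn=)" + tail, input)); // [doc_medical]
    }
}
```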
CodePudding user response:
We can use a regex replacement approach here:
String input = "cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr";
String cn = input.replaceAll(".*\\bcn=([^,]+),ou=tged,ou=groupes,o=choregie,c=fr.*", "$1");
System.out.println(cn); // doc_confidentiel (only the last match, because the leading .* is greedy)
Note that your current regex pattern, which uses lookarounds, seems to confuse lookbehinds with lookaheads. The approach I gave above doesn't need lookarounds at all.
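One caveat with the replacement approach: replaceAll returns the input unchanged when the pattern does not match, so the result can be mistaken for a value. A minimal sketch of a guard follows; the class and the helper lastCn are hypothetical names, and returning null for "no match" is just one possible convention.

```java
class ReplaceAllCaveat {
    static final String REGEX =
        ".*\\bcn=([^,]+),ou=tged,ou=groupes,o=choregie,c=fr.*";

    // Hypothetical helper: returns the last matching cn value, or null if none.
    static String lastCn(String input) {
        // String.matches requires the whole input to match, which the
        // leading and trailing .* allow for single-line input.
        return input.matches(REGEX) ? input.replaceAll(REGEX, "$1") : null;
    }

    public static void main(String[] args) {
        String noMatch = "cn=test,ou=test,ou=test,o=choregie,c=fr";
        // Without the guard, the whole input comes back untouched.
        System.out.println(noMatch.replaceAll(REGEX, "$1"));
        // With the guard, the absence of a match is explicit.
        System.out.println(lastCn(noMatch)); // null
    }
}
```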
CodePudding user response:
You could use a capture group and, for example, not match across a pipe | char:
\bcn=([^|]*),ou=tged,ou=groupes,o=choregie,c=fr\b
If it is the first value after the cn= then not matching a comma could also work:
\bcn=([^,]*),ou=tged,ou=groupes,o=choregie,c=fr\b
Explanation
\bcn= Match the word cn and then =
([^,]*) Capture group 1, optionally match any char that you do not allow (here, a comma)
,ou=tged,ou=groupes,o=choregie,c=fr\b Match the string, followed by a word boundary
For example
String regex = "\\bcn=([^,]*),ou=tged,ou=groupes,o=choregie,c=fr\\b";
String string = "cn=doc_medical,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr|cn=doc_confidentiel,ou=tged,ou=groupes,o=choregie,c=fr|cn=test,ou=test,ou=test,o=choregie,c=fr";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output
doc_medical
doc_confidentiel
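If the fixed suffix comes from a variable rather than being typed into the pattern, a sketch along these lines may help. It assumes Java 9+ (for Matcher.results()), and the class CnExtractor and method extract are hypothetical names. Pattern.quote makes the suffix match literally even if it ever contains regex metacharacters.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CnExtractor {
    // Hypothetical helper: collects every cn value directly followed by the suffix.
    static List<String> extract(String input, String suffix) {
        // [^,|]* keeps the capture inside one attribute of one pipe-separated entry.
        Pattern p = Pattern.compile("\\bcn=([^,|]*)" + Pattern.quote(suffix) + "\\b");
        return p.matcher(input)
                .results()                 // Stream<MatchResult>, Java 9+
                .map(r -> r.group(1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String suffix = ",ou=tged,ou=groupes,o=choregie,c=fr";
        String input = "cn=doc_medical" + suffix
            + "|cn=test,ou=test,ou=test,o=choregie,c=fr"
            + "|cn=doc_confidentiel" + suffix
            + "|cn=test,ou=test,ou=test,o=choregie,c=fr";
        System.out.println(extract(input, suffix)); // [doc_medical, doc_confidentiel]
    }
}
```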