RegEx: Get multiline LDAP entry

I'm trying to capture an entire LDAP entry, from its dn: line to the entry's last line, stopping at the last line before the next entry (e.g., before \n# entry-id: 8266). My trial and error with egrep is getting absolutely nowhere. Note: the data lives in exported LDIF files, FWIW.

The closest I've come is egrep "dn: cn=name,ou=People,dc=example,dc=com. .|\n*. \n", but it prints nothing on the terminal. I've tested the regex itself on regexr.com, though I understand that is a completely different environment.

Thanks in advance!

Sample Data:


dn: cn=name,ou=People,dc=example,dc=com
shadowLastChange: 17492
userPassword: password
sn: Last
givenName: First
cn: first
mail: [email protected]
displayName: First Last
o: University
ou: Dept.
objectClass: top
objectClass: person

# entry-id: 8266

CodePudding user response：

If the data is always structured like that, and using awk is an option, you can use a range pattern that starts at a line beginning with dn: and ends at a line containing entry-id, and print only the lines in that range that do not contain entry-id:

awk '/^dn:/,/entry-id/ {
  if(!/entry-id:/){print}
}' file
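
To see what the range pattern does when the file holds more than one entry, here is a small self-contained sketch. The two-entry sample is made up for illustration (the second entry and the surrounding entry-id values are assumptions, not taken from the question):

```shell
# Build a small two-entry LDIF sample (made-up data) in a temporary file.
ldif=$(mktemp)
cat > "$ldif" <<'EOF'
# entry-id: 8265
dn: cn=other,ou=People,dc=example,dc=com
sn: Other
objectClass: person

# entry-id: 8266
dn: cn=name,ou=People,dc=example,dc=com
sn: Last
objectClass: person

# entry-id: 8267
EOF

# Range pattern: start at a line beginning with "dn:", stop at "entry-id".
# The inner test drops the closing "entry-id" line itself.
out=$(awk '/^dn:/,/entry-id/ { if (!/entry-id:/) print }' "$ldif")
printf '%s\n' "$out"
rm -f "$ldif"
```

Every entry in the file is printed, not just one, and the blank separator line before each entry-id comment is part of the range, so it is printed as well.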


CodePudding user response：

With your shown samples, please try the following awk code.

awk '/entry-id/{found=""} /^dn:/{found=1} found' Input_file

Or, in case you want to print only the first set, from dn: up to the next entry-id:, try the following code:

awk '/entry-id/{exit} /^dn:/{found=1} found' Input_file
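
Since the question asks for one particular entry, the same flag idea can be narrowed to a given dn. This is only a sketch under assumptions: the two-entry sample is made up, the `target` variable is a hypothetical name, and the dn line is assumed to be plain text on a single line (no folding, no base64 `dn::` form). The flag `found` is switched on at the wanted dn line and switched off at the next entry-id comment:

```shell
# Build a small two-entry LDIF sample (made-up data) in a temporary file.
ldif=$(mktemp)
cat > "$ldif" <<'EOF'
# entry-id: 8265
dn: cn=other,ou=People,dc=example,dc=com
sn: Other
objectClass: person

# entry-id: 8266
dn: cn=name,ou=People,dc=example,dc=com
sn: Last
objectClass: person

# entry-id: 8267
EOF

# Hypothetical variable holding the dn we want to extract.
target='cn=name,ou=People,dc=example,dc=com'

out=$(awk -v dn="dn: $target" '
  /entry-id/ { found = 0 }   # an entry-id comment closes the current entry
  $0 == dn   { found = 1 }   # exact string comparison, not a regex
  found                      # print lines while inside the wanted entry
' "$ldif")
printf '%s\n' "$out"
rm -f "$ldif"
```

Comparing with `$0 == dn` avoids having to escape regex metacharacters in the dn, at the cost of failing on trailing whitespace or a differently cased attribute name.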

CodePudding user response：

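egrep uses extended regexps (it is equivalent to grep -E) and matches one line at a time, so \n in the pattern never matches. Prefer grep -P (Perl-compatible regexps) instead.
The -z flag makes grep treat the input as NUL-separated records, so a whole text file becomes one record and the pattern can span newlines. Add -o to print only the matched part rather than the entire record:

grep -Pzo "dn(\n|.)*?(\n\n|$)" file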

This matches dn followed by any number of characters, including newlines, up to the next occurrence of \n\n or the end of the input ($). The ? after * makes the preceding expression lazy instead of greedy, so the match stops at the first blank line rather than the last one.
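
A runnable sketch of the same idea for one specific entry, assuming GNU grep built with PCRE support (the -P option is not available everywhere). The two-entry sample is made up. Two things differ from the pattern above and are my own substitutions: `\z` (PCRE end of subject) stands in for `$`, because some GNU grep releases reject unescaped `^` or `$` together with -Pz, and `tr` turns the NUL that -z appends to each match back into a newline:

```shell
# Build a small two-entry LDIF sample (made-up data) in a temporary file.
ldif=$(mktemp)
cat > "$ldif" <<'EOF'
# entry-id: 8265
dn: cn=other,ou=People,dc=example,dc=com
sn: Other
objectClass: person

# entry-id: 8266
dn: cn=name,ou=People,dc=example,dc=com
sn: Last
objectClass: person

# entry-id: 8267
EOF

# -P: Perl-compatible regex, -z: whole file is one record, -o: print only the match.
# Lazy (\n|.)*? stops at the first blank line after the wanted dn line.
out=$(grep -Pzo 'dn: cn=name,ou=People,dc=example,dc=com(\n|.)*?(\n\n|\z)' "$ldif" \
        | tr '\0' '\n')
printf '%s\n' "$out"
rm -f "$ldif"
```

The dots inside the dn are regex wildcards here, which still match the literal dots. On large exports the `(\n|.)*?` construct may be slow, in which case one of the awk answers is likely the safer choice.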