awk to parse the ldap data between two strings

Hi, I want to extract the text between two strings, but in my case the first string (such as KDP2002 or KDP1005) is not constant across all entries in the output: the number after KDP keeps changing, and I do not want that KDP number to be printed.

$ ldapsearch -x -LLL -o ldif-wrap=no  -b ou=Projects,ou=People,ou=KDI,o=KDP cn="alltest1p1" KDPHomeDirectory
dn: cn=alltest1p1,ou=Projects,ou=People,ou=KDI,o=KDP
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch_c/q,Quota=20000,Id=scratch_c
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=economy,NisMap=KDP2002:/proj/KDP2002_alltest1p1/q,Quota=10000
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch/q,Quota=20000,Id=scratch
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch_a/q,Quota=20000,Id=scratch_a

Trial that works partially:

$ ldapsearch -x -LLL -o ldif-wrap=no  -b ou=Projects,ou=People,ou=KDI,o=KDP cn="alltest1p1" KDPHomeDirectory |  grep -o -P '(?<=NisMap=).*(?=,Quota)'
KDP2002:/proj/KDP2002_alltest1p1/q
KDP2002:/proj/KDP2002_alltest1p1_scratch/q
KDP2002:/proj/KDP2002_alltest1p1_scratch_a/q

Expected output:

/proj/KDP2002_alltest1p1/q
/proj/KDP2002_alltest1p1_scratch/q
/proj/KDP2002_alltest1p1_scratch_a/q

CodePudding user response：

I would use GNU sed for this task in the following way. Let the content of file.txt be

KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch_c/q,Quota=20000,Id=scratch_c
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=economy,NisMap=KDP2002:/proj/KDP2002_alltest1p1/q,Quota=10000
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch/q,Quota=20000,Id=scratch
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch_a/q,Quota=20000,Id=scratch_a

then

sed 's/.*KDP2002:\([^,]*\).*/\1/' file.txt

gives output

/proj/KDP2002_alltest1p1_scratch_c/q
/proj/KDP2002_alltest1p1/q
/proj/KDP2002_alltest1p1_scratch/q
/proj/KDP2002_alltest1p1_scratch_a/q

Explanation: I use a single capturing group, denoted by \( and \), which contains zero or more (*) non-comma characters ([^,]) located right after KDP2002:. The pattern is prefixed and suffixed by .* so that it spans the whole line, and the whole line is then replaced by the captured group (\1).

(tested in GNU sed 4.2.2)
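
Note that the command above hard-codes KDP2002, whereas the question says the number after KDP varies. Below is a minimal sketch of a more general variant, assuming the wanted value always follows NisMap=KDP<digits>: and ends at the next comma. The KDP1005 line is a made-up sample for illustration, and the variable names sample and sed_out are hypothetical.

```shell
# Sample input: the dn: line, one line from the question, and one
# made-up line with a different project number (KDP1005).
sample='dn: cn=alltest1p1,ou=Projects,ou=People,ou=KDI,o=KDP
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=economy,NisMap=KDP2002:/proj/KDP2002_alltest1p1/q,Quota=10000
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP1005:/proj/KDP1005_alltest1p1_scratch/q,Quota=20000,Id=scratch'

# KDP[0-9]* accepts any project number.
# -n together with the p flag prints only lines where the substitution
# succeeded, so the dn: line is not echoed back unchanged.
sed_out=$(printf '%s\n' "$sample" | sed -n 's/.*NisMap=KDP[0-9]*:\([^,]*\).*/\1/p')
printf '%s\n' "$sed_out"
```

In real use the printf would be replaced by the ldapsearch pipeline from the question.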

CodePudding user response：

1st solution: With your shown samples only, please try the following GNU awk code.

awk -v RS='=KDP[0-9]+:([^,]+)' 'RT{split(RT,arr,":");print arr[2]}' Input_file

2nd solution: With any awk version, using awk's match function. With your shown samples, please try the following code.

awk '
match($0,/=KDP[0-9]+:([^,]+)/){
  split(substr($0,RSTART,RLENGTH),arr,":")
  print arr[2]
}
'  Input_file
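
To make the match approach easier to follow, here is a small sketch on a single sample line from the question. It is a variant, not the code above: instead of split it takes everything after the first colon of the matched text with index and substr, which assumes nothing else but would also keep a path that itself contains a colon. The variable names line, awk_out and m are hypothetical.

```shell
# One sample line taken from the question.
line='KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch_c/q,Quota=20000,Id=scratch_c'

awk_out=$(printf '%s\n' "$line" | awk '
match($0, /=KDP[0-9]+:[^,]+/) {
  # RSTART and RLENGTH describe the matched text, here "=KDP2002:/proj/..."
  m = substr($0, RSTART, RLENGTH)
  # keep everything after the first colon of the matched text
  print substr(m, index(m, ":") + 1)
}')
printf '%s\n' "$awk_out"
```

Lines without a match (such as the dn: line of the ldapsearch output) print nothing, because the action only runs when match returns a non-zero position.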

CodePudding user response：

Using gnu-grep you can use:

grep -oP '=KDP\d+:\K[^,]+'
/proj/KDP2002_alltest1p1_scratch_c/q
/proj/KDP2002_alltest1p1/q
/proj/KDP2002_alltest1p1_scratch/q
/proj/KDP2002_alltest1p1_scratch_a/q

Here \K resets the start of the reported match, discarding everything matched so far, so only the text after =KDP\d+: is printed.
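
The reason \K is handy here: the attempt in the question uses a lookbehind, (?<=NisMap=), and a PCRE lookbehind generally cannot contain an unbounded quantifier such as \d+, so the variable KDP number cannot simply be added to it. A short sketch with two sample lines from the question, assuming a GNU grep built with -P support; the variable names sample and grep_out are hypothetical.

```shell
# Two sample lines taken from the question.
sample='KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=economy,NisMap=KDP2002:/proj/KDP2002_alltest1p1/q,Quota=10000
KDPHomeDirectory: nisMapName=auto.home,ou=KDI_US-CDC01,ou=Locations,ou=KDI,o=KDP#0#Quality=scratch,NisMap=KDP2002:/proj/KDP2002_alltest1p1_scratch/q,Quota=20000,Id=scratch'

# NisMap=KDP\d+: must match but is dropped from the output by \K;
# [^,]+ then captures the path up to the next comma.
grep_out=$(printf '%s\n' "$sample" | grep -oP 'NisMap=KDP\d+:\K[^,]+')
printf '%s\n' "$grep_out"
```

Anchoring on NisMap= rather than just = is a choice made for this sketch; it makes the pattern a little more specific.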

Alternatively, you can use this gnu-awk command:

awk 'match($0, /=KDP[0-9]+:([^,]+)/, a) {print a[1]}' file

/proj/KDP2002_alltest1p1_scratch_c/q
/proj/KDP2002_alltest1p1/q
/proj/KDP2002_alltest1p1_scratch/q
/proj/KDP2002_alltest1p1_scratch_a/q