Capture word after pattern with slash

Time:03-11

I want to extract word1 from:

something /CLIENT_LOGIN:word1 something else

I would like to extract the first word after matching pattern /CLIENT_LOGIN:.

Without the slash, something like this works:

A='something /CLIENT_LOGIN:word1 something else'
B=$(echo "$A" | awk '$1 == "CLIENT_LOGIN" { print $2 }' FS=":")

With the slash, though, I can't get it working (I tried putting / and \/ in front of CLIENT_LOGIN). I don't mind whether it is done with awk, grep, sed, ...

CodePudding user response:

Using sed:

s='=something /CLIENT_LOGIN:word1 something else'
sed -E 's~.* /CLIENT_LOGIN:([^[:blank:]]+).*~\1~' <<< "$s"

word1

Details:

  • We use ~ as regex delimiter in sed
  • /CLIENT_LOGIN:([^[:blank:]]+) matches /CLIENT_LOGIN: followed by one or more non-blank characters, which are captured in group #1
  • .* on both sides matches text before and after our match
  • \1 is used in substitution to put 1st group's captured value back in output
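One caveat worth sketching: without -n, sed prints a line unchanged when the substitution does not match, so a caller cannot tell a miss from a hit. A minimal variant, assuming the same pattern as above, adds -n and the p flag so that nothing is printed on a miss. The function name extract_login is hypothetical and exists only for this illustration.

```shell
# extract_login is a hypothetical helper name used only for this sketch.
extract_login() {
  # -n suppresses the default output; the trailing p flag prints a line
  # only when the substitution actually happened on it.
  printf '%s\n' "$1" | sed -nE 's~.* /CLIENT_LOGIN:([^[:blank:]]+).*~\1~p'
}

hit=$(extract_login 'something /CLIENT_LOGIN:word1 something else')
miss=$(extract_login 'something without the desired string')
printf 'hit=[%s] miss=[%s]\n' "$hit" "$miss"
```

Like the original command, this still expects a space before /CLIENT_LOGIN:, so a marker at the very start of the line would not match.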

CodePudding user response:

1st solution: With your shown samples, please try the following GNU grep solution.

grep -oP '^.*? /CLIENT_LOGIN:\K(\S+)' Input_file

Explanation: This uses GNU grep's -o and -P options, which print only the matched text and enable PCRE regex, respectively. The regex ^.*? /CLIENT_LOGIN:\K(\S+) uses a lazy match from the start of the line up to /CLIENT_LOGIN: so that the very first occurrence of the string is matched. \K then discards everything matched so far, so that only the required value is printed, and \S+ matches one or more non-space characters up to the next space.
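The -P option is specific to GNU grep; BSD/macOS grep, for example, does not provide it. As a rough fallback (a different technique from the \K approach above, not part of the original answer), -o can be combined with cut. The variable names line and login are only illustrative.

```shell
line='something /CLIENT_LOGIN:word1 something else'

# -o prints only the matched text, here "/CLIENT_LOGIN:word1";
# cut then keeps everything after the first colon.
login=$(printf '%s\n' "$line" | grep -o '/CLIENT_LOGIN:[^[:space:]]*' | cut -d: -f2-)
printf '%s\n' "$login"
```

Unlike the anchored PCRE version, this prints one line per occurrence if /CLIENT_LOGIN: appears more than once on a line.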



2nd solution: Using awk's match function along with its split function to print the required value.

awk '
match($0,/\/CLIENT_LOGIN:[^[:space:]]+/){
  split(substr($0,RSTART,RLENGTH),arr,":")
  print arr[2]
}
' Input_file


3rd solution: Using GNU awk's FPAT option, please try the following solution. FPAT is set to /CLIENT_LOGIN: followed by one or more non-space characters, so that text becomes the first field. In the main program, sub replaces everything up to the : in the first field with an empty string, and the first field is then printed.

awk -v FPAT='/CLIENT_LOGIN:[^[:space:]]+' '{sub(/.*:/,"",$1);print $1}' Input_file

CodePudding user response:

Performing a regex match and capturing the resulting string in BASH_REMATCH[]:

$ regex='.*/CLIENT_LOGIN:([^[:space:]]*).*'

$ A='something /CLIENT_LOGIN:word1 something else'
$ unset B

$ [[ "${A}" =~ $regex ]] && B="${BASH_REMATCH[1]}"
$ echo "${B}"
word1

Verifying B remains undefined if we don't find our match:

$ A='something without the desired string'
$ unset B

$ [[ "${A}" =~ $regex ]] && B="${BASH_REMATCH[1]}"
$ echo "${B}"
               <<<=== nothing output 

CodePudding user response:

Fixing your awk command, you can use

A="/CLIENT_IPADDR:23.4.28.2 /CLIENT_LOGIN:xdfmb1d /MXJ_C"
B=$(echo "$A" | awk 'match($0,/\/CLIENT_LOGIN:[^[:space:]]+/){print substr($0,RSTART+14,RLENGTH-14)}')

This yields xdfmb1d. Details:

  • \/CLIENT_LOGIN: - a /CLIENT_LOGIN: string
  • [^[:space:]]+ - one or more non-whitespace chars

The pattern above is what awk searches for. Once it matches, the part of the match after /CLIENT_LOGIN: is extracted using substr($0,RSTART+14,RLENGTH-14), where 14 is the length of the /CLIENT_LOGIN: string.
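To avoid counting the offset by hand, the prefix can be passed in as a variable and its length computed with length(). This is only a sketch under that assumption: the variable name prefix is illustrative, and because the prefix becomes part of a dynamic regex, any regex metacharacters in it would need escaping.

```shell
A='/CLIENT_IPADDR:23.4.28.2 /CLIENT_LOGIN:xdfmb1d /MXJ_C'

B=$(printf '%s\n' "$A" | awk -v prefix='/CLIENT_LOGIN:' '
  # Build the pattern from the prefix, then skip exactly length(prefix) characters.
  match($0, prefix "[^[:space:]]+") {
    n = length(prefix)   # 14 for "/CLIENT_LOGIN:"
    print substr($0, RSTART + n, RLENGTH - n)
  }')
printf '%s\n' "$B"
```

Changing the marker then only requires changing the value given to -v prefix=.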
