AWK to print after string match in file

I have written an awk command:

awk -F: '$1=="tag" {print $1;}' setup.py

I want to print the quoted value that follows tag=.

Sample input:

import tools

tools.setup(
    name='test',
    tag="0.0.8",
    packages=tools.ges(),
    line xyz
)

Output: 0.0.8

I was trying to output everything after tag but I can't even get that to work.

CodePudding user response：

1st solution: With your shown samples, please try the following awk program. It uses awk's match function with the regex tag="[^"]*, which matches from tag=" up to just before the next ". match sets RSTART and RLENGTH, so substr can skip the 5 characters of tag=" and print only the version part.

awk 'match($0,/tag="[^"]*/){print substr($0,RSTART+5,RLENGTH-5)}' Input_file
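As a rough check of the RSTART/RLENGTH arithmetic, the sketch below recreates the question's sample input in a temporary file and runs the match/substr approach on it. The temporary file and the variable names sample and version are assumptions made only for this demonstration; they are not part of the original answer.

```shell
# Recreate the sample input from the question in a temporary file
# (mktemp picks the name; it exists only for this demonstration).
sample=$(mktemp)
cat > "$sample" <<'EOF'
import tools

tools.setup(
    name='test',
    tag="0.0.8",
    packages=tools.ges(),
    line xyz
)
EOF

# tag=" is 5 characters long, so start 5 characters after RSTART
# and shorten the printed length by the same amount.
version=$(awk 'match($0, /tag="[^"]*/) { print substr($0, RSTART + 5, RLENGTH - 5) }' "$sample")
echo "$version"

# Clean up the temporary file.
rm -f "$sample"
```

On the tag line the match starts at column 5 and is 10 characters long, so substr returns the 5 characters starting at column 10.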

2nd solution: With GNU grep, please try the following. The -o option prints only the matched text and -P enables PCRE regex. The pattern matches tag=", then \K discards what has been matched so far, and [^"]* matches everything up to the next ", so only the version is printed.

grep -oP 'tag="\K[^"]*' Input_file

3rd solution: With GNU sed, please try the following. The -E option enables ERE (extended regular expressions), and -n suppresses automatic printing so that lines are printed only when explicitly requested. The substitution matches the whole line, captures the text between tag=" and the next " in the first and only capture group, replaces the line with that backreference, and the p flag prints the result.

sed -E -n 's/.*tag="([^"]*).*/\1/p' Input_file

CodePudding user response：

Using GNU awk (gawk), you could also match tag= with optional leading spaces from the start of the string and capture the tag version in a capture group.

ary[1] in the example code contains the group 1 value.

The pattern ^[[:blank:]]*tag="([0-9]+(\.[0-9]+)*)" matches:

^ Start of string
[[:blank:]]* Match optional spaces or tabs
tag=" Match literally
( Capture group 1
[0-9]+(\.[0-9]+)* Match 1+ digits and optionally repeat a . and 1+ digits
) Close group 1
" Match the closing "

gawk example:

awk 'match($0, /^[[:blank:]]*tag="([0-9]+(\.[0-9]+)*)"/, ary) {print ary[1]}' setup.py

Output

0.0.8
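The third argument to match is a gawk extension, so here is a hedged sketch of a different technique for other awk implementations: splitting each line on the double quote. This is not one of the answers above. It also hints at why the question's attempt printed nothing: with -F: the line contains no colon, so $1 is the whole line and never equals "tag". The two piped sample lines and the variable name version are assumptions for illustration only.

```shell
# Two sample lines are piped in so the sketch runs without setup.py.
# With " as the field separator, the tag line splits into:
#   $1 = '    tag='   $2 = '0.0.8'   $3 = ','
version=$(printf '%s\n' "    name='test'," '    tag="0.0.8",' |
  awk -F'"' '$1 ~ /^[ \t]*tag=$/ { print $2 }')
echo "$version"
```

Unlike the gawk pattern, this sketch does not check that the value looks like a dotted version number; it prints whatever sits between the first pair of double quotes on the tag line.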