Using GNU Awk 5.0.0, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.1.2), I want to check for a pattern using match.
My sample text is the following (with a space at the beginning of the line):
7 Plasmas Mobiles (30%)
Using the following regex, I am able to match the string:
[0-9]{1,} .{1,} \([0-9]{1,}%\)
This live example shows the match: regexr.com/6n3fh
However, awk's match returns 0:
awk '{print match($0, " [0-9]{1,} .{1,} \([0-9]{1,}%\)")}' reports/test
awk: cmd. line:1: warning: escape sequence `\(' treated as plain `('
awk: cmd. line:1: warning: escape sequence `\)' treated as plain `)'
0
Why is that, and how can I get the expected behavior, i.e. match returning 1?
CodePudding user response:
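In awk a regex constant is written as /the-regex/, see Regular Expressions. awk also offers Dynamic Regexps, where the regex is a quoted string, as you have it. awk treats the two styles differently: a double-quoted string is scanned twice, first as a string constant and then as a regex. In the first scan, gawk does not recognize \( as a string escape, so it prints the warning and reduces it to a plain (. The regex engine then sees an unescaped ( and ), which form a group instead of matching literal parentheses, so the pattern expects a space directly before the digits of the percentage and no longer matches " (30%)". To get a backslash through to the regex, escape it with a double backslash, e.g. \\(.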
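To look at the first of the two scans in isolation, one can print the string constant instead of passing it to match. This is only a minimal sketch; the shell variable name seen is made up for illustration.

```shell
# Print the text that is left after awk has processed the string constant.
# This is what the regex engine would receive as the dynamic regex.
seen=$(awk 'BEGIN { print "\\([0-9]{1,}%\\)" }')
printf '%s\n' "$seen"
```

With the doubled backslashes, a single backslash survives in front of each parenthesis, so the regex engine still treats them as literal characters.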
In your case you can either use:
match($0, / [0-9]{1,} .{1,} \([0-9]{1,}%\)/)
or
match($0, " [0-9]{1,} .{1,} \\([0-9]{1,}%\\)")
Example Use/Output
$ echo " 7 Plasmas Mobiles (30%)" | awk '{print match($0, / [0-9]{1,} .{1,} \([0-9]{1,}%\)/)}'
1
and
$ echo " 7 Plasmas Mobiles (30%)" | awk '{print match($0, " [0-9]{1,} .{1,} \\([0-9]{1,}%\\)")}'
1
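The dynamic form is mostly useful when the regex is held in a variable. The following is a sketch under a few assumptions: the names line, re and result are illustrative, and + is used as an equivalent of {1,} so that it also runs on awk implementations without interval support. Besides returning the position, match sets RSTART and RLENGTH.

```shell
line=' 7 Plasmas Mobiles (30%)'
result=$(printf '%s\n' "$line" | awk '
    {
        # Dynamic regex kept in a variable: backslashes are doubled
        re = " [0-9]+ .+ \\([0-9]+%\\)"
        if (match($0, re))
            print RSTART, RLENGTH   # start position and length of the match
    }')
printf '%s\n' "$result"
```

For the sample line this should report a match starting at position 1 and covering the whole line.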