AWK commands giving different results in different Ubuntu versions

Time:09-21

We are using the awk command below to split the numeric and alphabetic parts of an alphanumeric string.

echo "1.5GB" | awk '{ gsub(/([[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+)/,"&\n",$0) ; print "size="$1"\nsymbol="$2}'

This command gives the desired result on Ubuntu 20.04:

size=1.5
symbol=GB

But on Ubuntu 18.04 it gives the result below, which is not what we want:

size=1.5GB
symbol=

CodePudding user response:

I can't replicate the issue. The output of every awk I tried ended up with the same hash value:

% echo "1.5GB" | nawk '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58  stdin

% echo "1.5GB" | mawk '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58  stdin

% echo "1.5GB" | mawk2 '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58  stdin

% echo "1.5GB" | gawk -be '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58  stdin

% echo "1.5GB" | gawk -ne '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58  stdin

% echo "1.5GB" | gawk -ce '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58  stdin

% echo "1.5GB" | gawk -Pe '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58  stdin
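
When two machines disagree like this, it may help to check which implementation the plain `awk` command resolves to on each, and whether it understands POSIX character classes such as `[[:alpha:]]`. The sketch below is only a diagnostic aid. The helper names `awk_path` and `awk_has_posix_classes` are made up for illustration, and the idea that missing class support explains the Ubuntu 18.04 result is an assumption, not something confirmed in this thread.

```shell
# Hypothetical helper: show the binary that `awk` resolves to. On Ubuntu,
# `awk` is typically a symlink chain ending at mawk or gawk.
awk_path() {
    readlink -f "$(command -v awk)"
}

# Hypothetical helper: print "yes" if this awk treats [[:alpha:]] as a
# POSIX character class, "no" otherwise. An awk without class support
# parses the pattern as a bracket list followed by a literal "]", which
# cannot match the single character "Z".
awk_has_posix_classes() {
    if printf 'Z\n' | awk '{ exit !($0 ~ /^[[:alpha:]]$/) }'; then
        echo yes
    else
        echo no
    fi
}

awk_path
awk_has_posix_classes
```

Running both helpers on each machine and comparing the results shows whether the two systems are using the same awk at all.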

CodePudding user response:

It is unclear what change in mawk 1.3.4 versus 1.3.3 made your code work. Either way, the code is logically flawed if the intent is to display the numeric portion of the input as size and the alphabetical portion as symbol even when one of the two is missing. The call to gsub makes whichever run of characters comes first, alphabetical or numeric, the first field. For example, if the input is just GB, your code will output:

size=GB
symbol=

which I don't think is desired.

A better approach is to remove the alphabetical portion from the input to make it size, and remove the numeric portion from the input to make it symbol:

awk '{s=$0;sub(/[[:alpha:]]+/,"",s);sub(/[[:digit:].-]+/,"");print"size="s"\nsymbol="$0}'
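
To see how this approach behaves when one component is missing, here is a minimal sketch that wraps the same two `sub` calls in a shell function and runs it on three inputs. The function name `split_size` is a hypothetical name chosen for the example, and the sketch assumes an awk that supports POSIX character classes.

```shell
# Hypothetical wrapper around the command above; takes the text as $1.
split_size() {
    printf '%s\n' "$1" |
        awk '{ s = $0
               sub(/[[:alpha:]]+/, "", s)    # s keeps the numeric part
               sub(/[[:digit:].-]+/, "")     # $0 keeps the alphabetic part
               print "size=" s "\nsymbol=" $0 }'
}

split_size 1.5GB   # size=1.5 and symbol=GB
split_size GB      # size= (empty) and symbol=GB
split_size 1.5     # size=1.5 and symbol= (empty)
```

If an older awk without POSIX class support has to be handled, explicit ranges such as `[A-Za-z]` and `[0-9.-]` are a possible substitute, assuming the input is plain ASCII.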