We are using the awk command below to split the numbers and letters in an alphanumeric string.
echo "1.5GB" | awk '{ gsub(/([[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+)/,"&\n",$0) ; print "size="$1"\nsymbol="$2}'
This command gives the desired result on Ubuntu 20.04. The result is:
size=1.5
symbol=GB
But on Ubuntu 18.04 it gives the result below, which is not what we want:
size=1.5GB
symbol=
CodePudding user response:
I can't replicate the issue. The output of every awk I tried ended up with the same hash value:
% echo "1.5GB" | nawk '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58 stdin
% echo "1.5GB" | mawk '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58 stdin
% echo "1.5GB" | mawk2 '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58 stdin
% echo "1.5GB" | gawk -be '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58 stdin
% echo "1.5GB" | gawk -ne '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58 stdin
% echo "1.5GB" | gawk -ce '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58 stdin
% echo "1.5GB" | gawk -Pe '{ print NR,NF,$0,$1,$NF; gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/,"&\n",$0) ; print NR,NF,$0,$1,$NF }' | xxh128sum
1b0095d0c4c02859a61a0ab5a3253b58 stdin
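If you want to see the actual output rather than a hash, a loop along these lines may help. This is only a sketch: the function name compare_awks and the list of awk names are made up for illustration, and it simply skips any awk that isn't installed.

```shell
# Hypothetical helper (not from the original post): feed the same input and
# program to each awk implementation found on PATH, one result line per awk.
compare_awks() {
    input=$1
    prog=$2
    for a in awk gawk mawk nawk; do
        # Skip implementations that are not installed.
        command -v "$a" >/dev/null 2>&1 || continue
        # Join the output lines with "|" so each result fits on one line.
        out=$(printf '%s\n' "$input" | "$a" "$prog" | tr '\n' '|')
        printf '%s: %s\n' "$a" "$out"
    done
}

compare_awks "1.5GB" '{ gsub(/[[:alpha:]]+|[[:digit:].-]+|[^[:alnum:].-]+/, "&\n"); print "size=" $1 "\nsymbol=" $2 }'
```

An awk that splits the input as intended should show size=1.5|symbol=GB| after its name; one that behaves like the Ubuntu 18.04 result in the question would show size=1.5GB|symbol=| instead.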
CodePudding user response:
It is unclear what change in mawk 1.3.4 versus 1.3.3 made your code work. Either way, the code is logically flawed if the intent is to display the numeric portion of the input as size and the alphabetical portion as symbol even when one of the two components is missing: the call to gsub makes whichever run of alphabetical or numeric characters comes first the first field. For example, if the input is just GB, your code will output:
size=GB
symbol=
which I don't think is desired.
A better approach is to remove the alphabetical portion from the input to get size, and remove the numeric portion from the input to get symbol:
awk '{s=$0;sub(/[[:alpha:]]+/,"",s);sub(/[[:digit:].-]+/,"");print"size="s"\nsymbol="$0}'
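To try this on inputs where one part is missing, the same idea can be wrapped in a small shell function. This is just a sketch: the name split_size is made up for the example, and it assumes your awk supports POSIX character classes such as [[:alpha:]].

```shell
# Hypothetical wrapper (name chosen for this example only).
split_size() {
    printf '%s\n' "$1" | awk '{
        s = $0
        sub(/[[:alpha:]]+/, "", s)   # drop the first run of letters: s keeps the number
        sub(/[[:digit:].-]+/, "")    # drop the first run of digits, "." and "-" from $0
        print "size=" s "\nsymbol=" $0
    }'
}

split_size "1.5GB"
split_size "GB"
split_size "1.5"
```

With 1.5GB this prints size=1.5 and symbol=GB. With just GB, size is empty and symbol is GB; with just 1.5, size is 1.5 and symbol is empty. Note that sub only removes the first matching run, so an input such as 1GB2 would leave the trailing 2 in both values.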