Awk if-statement to count the number of characters (wc -m) coming from a pipe

Time:12-01

I scratched my head over this issue and couldn't understand what is wrong with my one-liner below.

Given that

echo "5" | wc -m 
2

and that

echo "55" | wc -m
3

I tried to add a zero in front of all single-digit numbers with an awk if-statement as follows:

echo "5" |  awk '{ if ( wc -m $0 -eq 2 ) print 0$1 ; else print $1 }'
05

which is "correct"; however, with 2-digit numbers I get the same zero in front.

echo "55" |  awk '{ if ( wc -m $0 -eq 2 ) print 0$1 ; else print $1 }'
055

How come? I assumed this was going to return only 55 instead of 055. I now understand I'm constructing the if-statement wrong.

What, then, is the right way (if one exists) to ask awk to evaluate whether whatever comes from the | has 2 characters, as one would do with wc -m?

I'm not interested in the optimal way to add leading zeros in the command line (there are enough duplicates of that).

Thanks!

CodePudding user response:

I suggest using printf:

printf "%02d\n" "$(echo 55 | wc -m)"
03
printf "%02d\n" "$(echo 123456789 | wc -m)"
10

Note: printf is available as a bash builtin. It mainly follows the conventions of the C function printf(). See:

help printf   # For the bash builtin in particular
man 3 printf  # For the C function
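
As a minimal sketch of the same idea without the detour through wc, the %02d conversion can pad the number itself. The function name pad2 is hypothetical and only used for illustration.

```shell
# pad2 is a hypothetical helper name, not part of any tool.
# %02d means: print an integer at least 2 characters wide, padded with zeros.
pad2() {
  printf '%02d\n' "$1"
}

pad2 5     # prints 05
pad2 55    # prints 55
pad2 123   # prints 123 (already wider than 2, so left unchanged)
```

Inputs that already carry a leading zero, such as 08, are read as octal constants by %d and may be rejected, so this sketch assumes plain decimal input.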

CodePudding user response:

Facts:

  • In AWK strings or variables are concatenated just by placing them side by side.
    For example: awk '{b="v" ; print "a" b}'
  • In AWK undefined variables are equal to an empty string or 0.
    For example: awk '{print a "b", -a}'
  • In AWK non-empty strings are true inside if.
    For example: awk '{ if ("a") print 1 }'
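
The three facts above can be checked quickly with commands that need no input; this is a small sketch using only BEGIN blocks, with variable names chosen for illustration.

```shell
# Each command uses only a BEGIN block, so no input is needed.

# 1. Adjacent values are concatenated.
awk 'BEGIN { b = "v"; print "a" b }'                        # prints av

# 2. An unset variable acts as "" in string context and 0 in numeric context.
awk 'BEGIN { print "[" a "]", a + 0 }'                      # prints [] 0

# 3. A non-empty string is true in a condition, even the string constant "0".
awk 'BEGIN { if ("0") print "true"; else print "false" }'   # prints true
```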

wc -m $0 -eq 2 is parsed as follows (binary - has higher precedence than string concatenation):

wc -m $0 -eq 2
( wc - m ) ( $0 - eq ) 2
                       ^   - integer value 2, converted to string "2"
                  ^^       - undefined variable `eq`, converted to integer 0
             ^^            - input line, so string "5" converted to integer 5
                ^          - subtracts 5 - 0 = 5
            ^^^^^^^^^^^    - integer 5, converted to string "5"
       ^                   - undefined variable "m", converted to integer 0
  ^^                       - undefined variable "wc" converted to integer 0
^^^^^^^^^                  - subtracts 0 - 0 = 0, converted to a string "0"
^^^^^^^^^^^^^^^^^^^^^^^^^  - string concatenation, results in string "052"

The result of wc -m $0 -eq 2 is string 052 (see awk '{ print wc -m $0 -eq 2 }' <<<'5'). Because the string is not empty, if is always true.
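
To see that the condition is a non-empty string for both inputs from the question, the expression can be printed instead of tested. This is a sketch; show_cond is a hypothetical helper name.

```shell
# show_cond is a hypothetical helper: it prints the value awk computes for
# the expression that the question uses as its if condition.
show_cond() {
  printf '%s\n' "$1" | awk '{ print wc -m $0 -eq 2 }'
}

show_cond 5    # prints 052
show_cond 55   # prints 0552
```

Neither result is empty, so the if branch that prepends the zero runs in both cases.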

> It should return only 55 instead of 055

No, it should not.

> Am I constructing the if statement wrong?

No, the if statement has valid AWK syntax. Your expectations of how it works do not match how it really works.

CodePudding user response:

To actually make it work (not that you would want to):

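echo 5 | awk '
{
  # Build a shell pipeline and read its one-line output into len
  cmd = "echo " $1 " | wc -m"
  cmd | getline len
  close(cmd)  # so the same command can be run again for a later line
  if (len == 2)
    print "0"$1
  else
    print $1
}'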

But why would you, when you can use this instead:

echo 5 | awk 'length($1) == 1 { $1 = "0"$1 } 1'

Or, even simpler, use one of the printf solutions from the other answers.
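
If the if/else shape from the question is preferred, a minimal sketch with awk's built-in length() could look like this. The helper name pad_if is hypothetical.

```shell
# pad_if is a hypothetical helper name used only for this sketch.
# wc -m reported 2 for "5" because echo appends a newline; awk strips that
# newline from the record, so the equivalent test is length($0) == 1.
pad_if() {
  printf '%s\n' "$1" | awk '{ if (length($0) == 1) print "0" $1; else print $1 }'
}

pad_if 5    # prints 05
pad_if 55   # prints 55
```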
