I'm migrating many bash shell scripts from old versions of Raspbian and Ubuntu to the current Raspbian version. To my horror, I found that the behavior of awk print and awk printf has changed in the latest version. How can I make awk work the way it used to, so that I don't have to find every place I used it, and so that the same script can run on both new and old versions of Debian?
I'm using awk print to split lines. Here's an example where I want to assign '5' to bar:
foo="value=5"
bar="$(echo "$foo" | tr "=" " " | awk '{print $2}')"
For most of my testing, I run:
foo="value=5"
echo "$foo" | tr "=" " " | awk '{print $2}'
so I can easily see if there's a newline or not, or any value at all or not, especially when I try print vs printf, and various format strings.
In the two old versions, this gave a usable '5'. In the new version, I get an unusable 5. That's it. That's the problem. Now every arithmetic statement and IF statement blows up, which is how I found the problem after much searching.
With the old versions, I got the same result for both print and printf (the latter without a format string). But in the new version, I get nothing when I use printf without a format string -- which is probably the way it should have been.
It gets worse. I found a format string, '"%c", $2', to use with printf in the current awk, but I have to use '"%s", $2' in the old version. Note '%c' vs '%s'. I can tell whether there's a newline after the '5' by whether the command prompt follows it on the same line. I have to use different format strings in the two versions.
But that raises the question... why did awk print and awk printf change, and how can I avoid keeping two versions of the same script to work on different versions of Debian?
As requested, I ran 'typeset -p bar ==> declare -- bar=5' on the three systems, but it always gives an error: -bash: typeset: bar: not found.
I researched how to find the awk version, and ran 'awk -W version' on all systems. THIS IS BAFFLING because I get the identical version, mawk 1.3.3 Nov 1996, on all three systems, yet I'm getting different results for my example. Is there some other factor? Choice of keyboard or locale, perhaps? The latest OS is a brand new installation, although I set it for USA, English and a standard keyboard.
Avoiding three sub-processes by doing it another way as described by markp-fuso sure sounds like a winner, but I have no idea how the two commands that were given work.
Please note that I'm running bash not sh. I don't know if there's a difference between awk and mawk. I run the awk command.
Thank you for your help.
CodePudding user response:
#!/bin/sh -x
echo "value=5" | tr "=" "\n" > temp
echo "1,2p" | ed -s temp
I have come to view Ed as UNIX's answer to the lightsaber.
CodePudding user response:
I found a format string, '"%c", $2' to use with printf in the current awk, but I have to use '"%s", $2 in the old version. Note '%c' vs '%s'.
%c behavior depends on the type of argument you feed it: if it is numeric, you get the character corresponding to the given ASCII code; if it is a string, you get its first character. For example,
mawk 'BEGIN{printf "%c", 42}' emptyfile
does give output
*
and
mawk 'BEGIN{printf "%c", "HelloWorld"}' emptyfile
does give output
H
Apparently your 2nd field consists of a digit plus some junk characters, so it is treated as a string and the second behavior applies. But is taking the first character the correct action in all your use cases? Does this behavior meet your requirements for multi-digit numbers, e.g. 555?
(tested in mawk 1.3.3)
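To make the multi-digit concern concrete, here is a minimal sketch. It assumes, purely for illustration, that the "junk" is a trailing carriage return; the question does not confirm this, and the value 555 is a hypothetical stand-in. It compares %c with %s on a string, uses od -c to make hidden characters visible, and then extracts the value with shell parameter expansion instead of tr and awk. That last part is one common way to avoid the subprocesses and is not necessarily the same as the commands mentioned in the question.

```shell
# ASSUMPTION: the junk character is a carriage return (not confirmed by the question).
cr="$(printf '\r')"
foo="value=555$cr"   # hypothetical input with a hidden trailing \r

# With a string argument, %c keeps only the first character; %s keeps all of it.
first="$(awk 'BEGIN{printf "%c", "555"}')"
whole="$(awk 'BEGIN{printf "%s", "555"}')"
echo "%c gives [$first], %s gives [$whole]"

# od -c shows every byte, so an invisible \r after the digits becomes visible.
printf '%s\n' "$foo" | tr '=' ' ' | awk '{print $2}' | od -c

# Parameter expansion does the split inside the shell, with no subprocesses.
bar="${foo#*=}"      # remove everything up to and including the first '='
bar="${bar%"$cr"}"   # remove one trailing carriage return, if present
echo "bar=[$bar]"    # prints: bar=[555]
```

Running the od -c line against the real input on each system would show whether the new installation really adds an extra character after the 5. If it does, stripping that character once keeps a single script working on all three systems.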