I have a file to process; its columns are separated by tabs:
$ cat system.log
2 camila create db
3 andrew create table
5 greg update table
6 nataly update view
7 greg delete table
9 camila update table
11 nataly create view
12 peter link table
14 andrew update view
15 greg update db
I want these lines displayed in the following form:
Entry No. 2: camila (action: create db)
To do so, I created the following bash script:
#!/bin/bash
filename=$1
while read line; do
printf $line | awk -F '\t' '{ print "Entry No. ", $1, ": ", $2, " (action: ", $3, ")" }'
done < $filename
However, what I get is:
$ ./log_parser.sh system.log
Entry No. 2 : (action: )
Entry No. 3 : (action: )
Entry No. 5 : (action: )
Entry No. 6 : (action: )
Entry No. 7 : (action: )
Entry No. 9 : (action: )
Entry No. 11 : (action: )
Entry No. 12 : (action: )
Entry No. 14 : (action: )
Entry No. 15 : (action: )
Why does only the first column get processed, and what should I do to have the whole line processed?
CodePudding user response:
You must quote your variables to prevent word splitting. Consider if $line evaluates to the string 2 camila create db. In that case, printf $line is equivalent to printf 2 camila create db, which calls printf with 4 arguments. printf correctly parses those arguments and dutifully writes the string 2. If you want to pass a single argument to printf, you could do printf "$line". But that is also incorrect, as the first argument to printf should be a format string, and you don't want to use input strings as format strings. Instead, you should write printf '%s' "$line". But don't do that, either. while read; printf | awk is an anti-pattern. Just use awk to read the input.
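The word splitting described above can be observed directly. The following is a minimal sketch: count_args is a hypothetical helper introduced only for illustration, and the line variable is built by hand to match the first row of system.log.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

# Same content as the first row of system.log, with real tabs between columns.
line=$(printf '2\tcamila\tcreate db')

count_args $line     # unquoted: split on tabs and spaces into 4 words, prints 4
count_args "$line"   # quoted: passed as one argument, prints 1

# Fixed format string, data passed as an argument: the tabs survive.
printf '%s\n' "$line"
```

With the unquoted expansion, printf receives 2 as its format string and the remaining three words as arguments that the format never uses, which is why only the first column appears.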
CodePudding user response:
Failing to wrap $line in double quotes causes the \t characters to be replaced with spaces, which in turn screws up the awk -F'\t'.
Consider:
$ line=$(head -1 system.log)
# double quoting ${line} maintains the \t characters:
$ echo "${line}" | od -c
0000000 2 \t c a m i l a \t c r e a t e
0000020 d b \n
0000023
# no (double) quoting of ${line} replaces the \t with spaces:
$ echo ${line} | od -c
0000000 2 c a m i l a c r e a t e
0000020 d b \n
0000023
The issue is further compounded by how printf handles the unquoted ${line}, e.g.:
$ printf ${line}
2
$ printf "${line}"
2 camila create db
As for the whole while loop, and assuming the sole purpose of the while loop is to send the modified file contents to stdout (i.e., you're not using ${line} for other bash-level operations), you could replace the whole thing with a single awk call, e.g.:
$ awk -F '\t' '{ print "Entry No. ", $1, ": ", $2, " (action: ", $3, ")" }' system.log
Entry No. 2 : camila (action: create db )
Entry No. 3 : andrew (action: create table )
Entry No. 5 : greg (action: update table )
Entry No. 6 : nataly (action: update view )
Entry No. 7 : greg (action: delete table )
Entry No. 9 : camila (action: update table )
Entry No. 11 : nataly (action: create view )
Entry No. 12 : peter (action: link table )
Entry No. 14 : andrew (action: update view )
Entry No. 15 : greg (action: update db )
NOTE: the extra spaces in the output are due to how the print command is being built; separating the arguments with commas makes awk insert its output field separator (OFS, a single space by default) between them; removing the commas, so that the strings are simply concatenated, generates:
$ awk -F '\t' '{ print "Entry No. " $1 ": " $2 " (action: " $3 ")" }' system.log
Entry No. 2: camila (action: create db)
Entry No. 3: andrew (action: create table)
Entry No. 5: greg (action: update table)
Entry No. 6: nataly (action: update view)
Entry No. 7: greg (action: delete table)
Entry No. 9: camila (action: update table)
Entry No. 11: nataly (action: create view)
Entry No. 12: peter (action: link table)
Entry No. 14: andrew (action: update view)
Entry No. 15: greg (action: update db)
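If the loop does need to stay in the shell (for example, because each field is used in other bash-level operations), one possible approach is to let read split on tabs itself and drop awk entirely. This is only a sketch: format_log and the variable names id, user and action are assumptions made for illustration, and the sample input is piped in by hand rather than read from system.log.

```shell
# A literal tab character, built portably.
tab=$(printf '\t')

# format_log is a hypothetical helper: it reads tab-separated lines from stdin.
format_log() {
    # IFS is set to a tab for read only, so "create db" stays in one field;
    # -r keeps any backslashes in the input literal.
    while IFS="$tab" read -r id user action; do
        printf 'Entry No. %s: %s (action: %s)\n' "$id" "$user" "$action"
    done
}

# Two sample rows in the same layout as system.log.
printf '2\tcamila\tcreate db\n3\tandrew\tcreate table\n' | format_log
```

For the real file, the call would be format_log < system.log. For plain reformatting, the single awk call is still the simpler choice.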