Bash script using awk doesn't read the entire line, just the first column

There's a file to be processed, columns are separated by tabs:

$ cat system.log
2       camila  create db
3       andrew  create table
5       greg    update table
6       nataly  update view
7       greg    delete table
9       camila  update table
11      nataly  create view
12      peter   link table
14      andrew  update view
15      greg    update db

I want each of these lines displayed in the following form:

Entry No. 2: camila (action: create db)

To do so, I created the following bash script:

#!/bin/bash

filename=$1

while read line; do
        printf $line | awk -F '\t' '{ print "Entry No. ", $1, ": ", $2, " (action: ", $3, ")" }'
done < $filename

However, what I get is:

$ ./log_parser.sh system.log
Entry No.  2 :    (action:   )
Entry No.  3 :    (action:   )
Entry No.  5 :    (action:   )
Entry No.  6 :    (action:   )
Entry No.  7 :    (action:   )
Entry No.  9 :    (action:   )
Entry No.  11 :    (action:   )
Entry No.  12 :    (action:   )
Entry No.  14 :    (action:   )
Entry No.  15 :    (action:   )

Why does only the first column get processed, and what should I do to have the whole line processed?

CodePudding user response：

You must quote your variables to prevent word splitting. Suppose $line holds the string 2 camila create db. Then printf $line is equivalent to printf 2 camila create db, which calls printf with 4 arguments. printf takes the first argument, 2, as its format string; since that format contains no conversion specifications, the remaining arguments are not printed, and only the string 2 is written.

If you want to pass a single argument to printf, you could write printf "$line". But that is also incorrect: the first argument to printf is a format string, and you don't want to use input data as a format string. Instead, you should write printf '%s' "$line".

But don't do that, either. while read; printf | awk is an anti-pattern. Just let awk read the input file directly.
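To illustrate the format-string point, here is a minimal sketch. The log line is made up for the demonstration (it is not in system.log); it contains %d so that the difference between the two printf calls becomes visible:

```shell
#!/bin/bash

# Build a literal tab character portably.
tab=$(printf '\t')

# Hypothetical log line (not from system.log) whose action contains "%d".
line="20${tab}greg${tab}set rate=%d"

# Data used as the format string: printf treats "%d" as a conversion
# specification and, with no argument to fill it, substitutes 0.
unsafe=$(printf "$line")

# Data passed as an argument to a fixed format: printed verbatim.
safe=$(printf '%s' "$line")

printf 'unsafe: %s\n' "$unsafe"
printf 'safe:   %s\n' "$safe"
```

With the fixed '%s' format the line comes through unchanged, whatever characters it happens to contain.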

CodePudding user response：

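Failing to wrap $line in double quotes subjects it to word splitting: the \t characters are treated as word separators and never reach the command (echo, for instance, rejoins the words with single spaces), which in turn breaks the awk -F '\t'.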

Consider:

$ line=$(head -1 system.log)

# double quoting ${line} maintains the \t characters:

$ echo "${line}" | od -c
0000000   2  \t   c   a   m   i   l   a  \t   c   r   e   a   t   e
0000020   d   b  \n
0000023

# no (double) quoting of ${line} replaces the \t with spaces:

$ echo ${line} | od -c
0000000   2       c   a   m   i   l   a       c   r   e   a   t   e
0000020   d   b  \n
0000023

The issue is further compounded by how printf handles the unquoted ${line}: only the first word is used, as the format string, e.g.:

$ printf ${line}
2

$ printf "${line}"
2       camila  create db
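To see the word splitting directly, you can count the arguments a command actually receives. This is only a sketch: count_args is a made-up helper, and the default IFS (space, tab, newline) is assumed:

```shell
#!/bin/bash

# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo "$#"; }

# Rebuild the first line of system.log with literal tabs.
tab=$(printf '\t')
line="2${tab}camila${tab}create db"

# Unquoted: the shell splits on tabs and spaces -> 2, camila, create, db.
unquoted=$(count_args $line)

# Quoted: one single argument, tabs intact.
quoted=$(count_args "$line")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

The unquoted call receives four separate words, which is why printf ${line} only ever sees 2 as its format string.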

As for the while loop itself: assuming its sole purpose is to send the reformatted file contents to stdout (i.e., you're not using ${line} for other bash-level operations), you could replace the whole thing with a single awk call, e.g.:

$ awk -F '\t' '{ print "Entry No. ", $1, ": ", $2, " (action: ", $3, ")" }' system.log
Entry No.  2 :  camila  (action:  create db )
Entry No.  3 :  andrew  (action:  create table )
Entry No.  5 :  greg  (action:  update table )
Entry No.  6 :  nataly  (action:  update view )
Entry No.  7 :  greg  (action:  delete table )
Entry No.  9 :  camila  (action:  update table )
Entry No.  11 :  nataly  (action:  create view )
Entry No.  12 :  peter  (action:  link table )
Entry No.  14 :  andrew  (action:  update view )
Entry No.  15 :  greg  (action:  update db )

NOTE: the extra spaces in the output come from how the print statement is built: separating the arguments with a comma makes awk insert its output field separator (OFS, a single space by default) between them. Removing the commas, so that the strings are simply concatenated, generates:

$ awk -F '\t' '{ print "Entry No. " $1 ": " $2 " (action: " $3 ")" }' system.log
Entry No. 2: camila (action: create db)
Entry No. 3: andrew (action: create table)
Entry No. 5: greg (action: update table)
Entry No. 6: nataly (action: update view)
Entry No. 7: greg (action: delete table)
Entry No. 9: camila (action: update table)
Entry No. 11: nataly (action: create view)
Entry No. 12: peter (action: link table)
Entry No. 14: andrew (action: update view)
Entry No. 15: greg (action: update db)
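
If you do need each field at the bash level (for something beyond printing), the loop can split the columns itself instead of piping every line to awk. A minimal sketch, not the original script: format_log and the variable names id, user and action are made up here, and the input is assumed to have exactly three tab-separated columns with no empty fields:

```shell
#!/bin/bash

# format_log is a hypothetical function name; it reads tab-separated
# records from the file named in $1 and prints one formatted line each.
format_log() {
    local tab id user action
    tab=$(printf '\t')
    # IFS is set to a tab for read only; -r keeps backslashes literal.
    while IFS="$tab" read -r id user action; do
        printf 'Entry No. %s: %s (action: %s)\n' "$id" "$user" "$action"
    done < "$1"
}

# Demo on a two-line sample copied from system.log.
sample=$(mktemp)
printf '2\tcamila\tcreate db\n3\tandrew\tcreate table\n' > "$sample"
format_log "$sample"
rm -f "$sample"
```

Note that tab counts as whitespace for IFS purposes, so consecutive tabs are collapsed and an empty column would shift the remaining fields; for data like that, the awk version is the safer choice.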